ABSTRACT



A syntactic  approach  is  applied  to  the  shape  description  and  recog- 
nition. The  structure  of  a shape  is  described  by  grammatical  rules  and 
the  local  details  by  primitives.  Four  attributes  are  proposed  to 
describe  an  open  curve  segment,  and  the  angle  between  two  consecutive 
curve  segments  is  used  to  describe  the  connection.  The  property  of  the 
attributes  and  the  recognition  capability  of  this  method  are  studied. 
The  primitive  extraction  and  syntax  analysis  can  be  performed  in  the 
same  step  by  using  both  semantic  and  syntactic  information,  namely,  the 
attributes  and  production  rules. 

The recognition system has been implemented and tested on the
recognition of airplane shapes. The performance is quite satisfactory
with  respect  to  accuracy  and  computational  efficiency.  The  method  is 
extended  to  recognize  partially  distorted  shapes.  The  distorted  portion 
of  the  shape  can  be  measured  in  terms  of  error-weight.  The  class  mem- 
bership functions  of  different  shapes  and  the  error-weight  estimation  of 
the  distorted  portions  are  included  in  the  extended  recognition  algo- 
rithms for  recognizing  noisy  and  distorted  shape  patterns.  The  success 
of this extension shows the advantage of the syntactic approach using
attributed grammars over other existing shape recognition methods.

The  grammatical  inference  procedures  for  shape  grammars  are  also 
developed. The shape grammars can be inferred automatically,
interactively, or manually, directly from the noisy vector patterns.

Although  this  approach  has  only  been  tested  on  airplane  shapes,  the 
results  are  also  applicable  to  more  general  shape  analysis  problems. 

CHAPTER 1

INTRODUCTION 


1.1 Shape Recognition

Since  the  early  1960's,  pattern  recognition  has  received  increasing 
attention because of the utilization of digital computers. Recognition
techniques can be applied to many kinds of pictorial data, such as
characters, machine parts, fingerprints, aerial reconnaissance pictures,
bubble chamber photographs, chest radiographs, blood cells, and
chromosome images. The pattern of interest may be an object, a disease, or a
phenomenon.  The  picture  may  provide  information  by  its  shape,  texture, 
and/or  color.  Shape  seems  to  be  the  most  important  of  these  in  many 
recognition  applications.  For  instance,  characters  are  recognized  pure- 
ly by  their  shapes.  Machine  tools  and  parts  can  also  be  recognized  by 
their  shapes.  In  our  everyday  life  experience,  we  also  find  that  most 
of  the  time  we  can  identify  an  object  only  by  its  shape. 

Shapes are usually described by their skeleton, contour, symmetry,
or other numerical features. The skeleton, which describes the structure
of an object, can be obtained from the contour [17,33]. Symmetry and
other numerical features are usually derived from the contour
[18,20,25,55]. Sometimes the region inside the contour is also used to
extract information for recognition [6], but the inside region is
obviously dependent on the contour. For patterns which are very similar
in structure, the boundary details may be needed for discrimination. The
contour provides structural information as well as boundary detail
information. The recent papers [50,55] by Davis also suggested describing
and understanding a picture by characterizing the angles, sides, and
symmetry of the outlines. This implies that the outer boundary is very
informative in describing and recognizing an object in a two-dimensional
image. Therefore, we concentrate on object identification using shape
information, and we define shape as the outer contour outlining the
object in the two-dimensional image.

1.2 A Syntactic Approach to General Shape Recognition

Over the past years, many methods for shape description and
recognition have been studied. Most of them will be discussed extensively
in Chapter 2 under three categories: template matching [3], statistical
methods [1,2,3], and syntactic methods [4,5]. To recognize a picture by
straightforward template matching, the machine has to memorize a large
number of templates. For shapes which differ in structure, we can store
the skeletal templates in the machine and match the skeletons to
recognize a shape. But the methods for finding the skeleton are either
too sensitive to noise or not sensitive enough to the detailed variations
in the boundary.

To  apply  the  statistical  methods  to  picture  recognition,  we  need  to 
select  features  which  can  reflect  the  differences  between  classes  and 
are  not  sensitive  to  noise.  The  feature  space  can  be  partitioned  into 
regions  corresponding  to  classes.  An  unknown  pattern  is  then  recognized 
by  locating  its  extracted  features  in  the  feature  space.  A pattern 
whose extracted features are found in a region in the feature space is
recognized  as  the  corresponding  class.  The  statistical  methods  are  ef- 
fective, if  proper  features  can  be  selected.  Unfortunately,  it  is  usu- 
ally difficult  to  select  good  features.  Fourier  descriptors  and  moment 
invariants  are  good  features  for  the  recognition  of  rigid-body  objects. 
When  parts  of  the  object  are  covered  or  touched  by  other  objects,  parts 
of  the  shape  may  be  badly  distorted,  while  the  rest  of  the  shape  may  be 
left  unaltered.  This  type  of  distortion  may  affect  the  feature  values 
so  badly  that  the  recognition  fails. 

The  syntactic  approach  essentially  breaks  a shape  into  simpler 
subshapes.  The  production  rules  describe  the  relationships  among  the 
subshapes.  The  simplest  shape  elements  are  defined  as  primitives  and 
described  by  attributes.  An  unknown  shape  pattern  is  then  recognized  by 
recognizing  its  primitives  and  analyzing  the  syntax  among  the  primitives 
according  to  the  grammatical  rules.  The  syntactic  approach  seems  to  be 
the  most  promising  approach,  because  it  uses  grammatical  rules  to 
describe  the  shape  structure  explicitly  and  primitives  to  describe  the 
boundary details. Other existing syntactic methods were developed for
particular applications. They use specially defined primitives for
particular pattern classes. The semantic information (the attributes) is
used only in the primitive extraction stage, and the syntactic
information (the production rules) is used in the parsing, or syntax
analysis, stage. We intend to develop a method which fully utilizes the
syntactic and semantic information to solve a general class of shape
analysis problems.

In defining primitives for general shapes, we must try to avoid
requiring a context-sensitive grammar to describe a shape, because
context-sensitive grammars are very difficult to parse. If the
primitives are very simple curve segments of fixed length, then we
may need context-sensitive grammars to handle the size problem. In
fact, the complexities of the primitives and of the production rules are
inversely related: we may use complex production rules for simple
primitives, or vice versa. In order to minimize the overall complexity,
we may use primitives which are sophisticated enough to avoid
context-sensitive grammars but are characterized by a rather small
number of features, or attributes. The attributes can be considered as
carrying the semantic information of a primitive. To fully utilize the
syntactic and semantic information to obtain an optimal solution, we
employ the attributed grammar to describe and recognize general shapes.

1.3 Summary of the Contents

Chapter 2 is an extensive survey of the existing methods for shape
description and recognition. Our proposed syntactic method is described
and discussed in Chapter 3. We started our study with the geometrical
analysis of curve segments in the continuous case. We found that four
attributes are sufficient to describe a simple curve segment, and that
they can be transformed into four variables which are invariant with
respect to translation, rotation, and scaling of the shape. The
computation of the attributes in the discrete case is then derived, and
the relevant properties are discussed. It is interesting that the four
attributes can also characterize a nonterminal curve segment. Therefore,
every nonterminal or primitive has attributes. Also, for each production
rule there is a set of attribute rules to relate the attributes. These
properties satisfy the definition of an attributed grammar [5,44]. With
attributed grammars, we actually describe and recognize shapes by using
both syntactic and semantic information. In fact, both syntactic and
semantic information can be used in the same step, the
primitive-extraction-embedding parsing, which performs the primitive
extraction and parsing together. This integration reduces the errors in
primitive extraction and their consequent difficulties.

Chapter  4 describes  an  implementation  of  this  method  with  aircraft 
shape  recognition.  This  implementation  is  designed  to  demonstrate  that 
the  proposed  syntactic  method  is  capable  of  discriminating  shapes  by 
structure  as  well  as  by  boundary  detail.  The  process  of  pictorial  pat- 
tern recognition  usually  consists  of  three  steps:  preprocessing, 
description,  and  recognition.  (See  Figure  1.1.)  In  our  method,  the 
primitive-extraction-embedding  parsing  practically  combines  the  latter 
two  steps.  In  the  first  step  of  preprocessing,  many  operations  includ- 
ing digitization,  noise  reduction,  object  detection,  scene  segmentation, 
etc.  may  be  performed.  Since  we  are  interested  in  using  the  shape  in- 
formation, we  assume  that  the  shape  or  contour  can  be  easily  obtained 
from the picture. In our experiments, the preprocessing stage includes
digitization, thresholding, boundary following, and smoothing. It then
outputs a shape. The shape, a chain of vectors, is subsequently
recognized by a parsing program. All these programs are written in FORTRAN.

The recognition schemes are extended to recognize some badly
distorted shapes in Chapter 5. Distorted shape recognition is an important
advantage of the syntactic method over other existing methods. We
generalize the recognition functions for primitives and nonterminals, and
utilize the error-correcting technique in the
primitive-extraction-embedding parser, so that the parser can fully use
the context information to correctly segment the boundary chain and
compute the percentage of the boundary that is distorted. Several
modified versions of Earley's parsing algorithm are described and
discussed to show the feasibility of the ideas. Some experimental results
demonstrate the power of recognizing distorted shapes. Unfortunately,
these modified algorithms have the disadvantages of large storage
requirements and long computation times.

Chapter  6 is  devoted  to  grammatical  inference.  The  shape  grammars 
can  be  constructed  manually,  interactively,  or  automatically.  The  major 
contents  include  an  automatic  learning  algorithm  ALA,  an  interactive 
learning  algorithm  ILA,  and  the  conversion  from  context-free  shape  gram- 
mars to  finite-state  shape  grammars.  Both  ALA  and  ILA  are  designed  to 
infer  shape  grammars  from  noisy  boundary  vector  chains. 

Chapter  7 summarizes  the  results  of  this  study  and  proposes  sugges- 
tions for  future  research. 

CHAPTER 2


SURVEY  OF  SHAPE  DESCRIPTION  AND  RECOGNITION 


Shape has been an interesting research topic for more than twenty
years. Attneave [49] focused his attention on the angles of shapes
in 1954. Davis [50,55] studied methods for detecting angles, sides,
and symmetry. Pavlidis [51] and Ramer [52] tried to approximate plane
curves by straight lines. We concentrate on using shape
information for recognition purposes.

As  mentioned  in  Chapter  1,  the  description  and  recognition  of 
shapes  are  two  consecutive  steps  in  the  pattern  recognition  system.  The 
recognition  or  classification  is  very  much  related  to  the  description 
method. Many methods have been proposed to describe and recognize
objects by shapes. Most of them will be discussed in the following
sections.


2.1 Template Matching

Template matching [3,7], or prototype matching, is one of the most
primitive methods of pattern recognition and is still used in many
practical applications. This method can be described as follows. Some
templates or prototypes with known classification are stored in the
machine. The machine matches the unknown input pattern to the
prototypes, calculates the differences with a difference function, and
then assigns the unknown to the closest class. For fixed and restricted
shapes, such as characters printed by the same machine, this method is
simple and useful. If the input patterns have many varieties for each
class, the machine has to memorize a large number of prototypes.
Sometimes a single difference function may not work unless all the
possible prototypes are stored in the machine.

For example, Figures 2.1(a) and 2.1(b) are two prototypes of the
numerals "8" and "3". Let us use the number of differing pixels as the
difference function. The difference between Figures 2.1(c) and 2.1(a) is
32, while that between Figures 2.1(c) and 2.1(b) is 23. The difference
between Figures 2.1(d) and 2.1(a) is 22, while that between Figures
2.1(d) and 2.1(b) is 28. Thus 2.1(c) is assigned to "3" and 2.1(d) to
"8"; the misclassification is obvious. It is not easy for this method to
handle small changes in the input pattern, even though the change seems
trivial to a human. However, this method is used in practice in
commercial machines, such as postal sorting machines, credit card
bookkeeping machines, etc. The main reason for this is that this method
can be implemented optically [7] to achieve very high speeds. In order
for the machine to recognize objects more intelligently, more
sophisticated matching techniques such as relaxation [56] were
developed.
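
The pixel-count matching rule described above can be sketched in a few
lines of Python. This is only an illustration of the idea, not the
optical implementation mentioned in the text. The 3 x 3 binary patterns
and the names `pixel_difference` and `classify` are hypothetical; they
are not the patterns of Figure 2.1.

```python
def pixel_difference(a, b):
    """Count the pixels at which two equal-sized binary images differ."""
    return sum(pa != pb for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b))


def classify(unknown, prototypes):
    """Return the label of the stored prototype closest to `unknown`."""
    return min(prototypes, key=lambda label: pixel_difference(unknown, prototypes[label]))


# Hypothetical 3x3 prototypes (assumed for illustration only).
PROTOTYPES = {
    "bar": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "ring": [[1, 1, 1], [1, 0, 1], [1, 1, 1]],
}

noisy_bar = [[0, 1, 0], [0, 1, 1], [0, 1, 0]]  # one pixel flipped

print(pixel_difference(noisy_bar, PROTOTYPES["bar"]))
print(classify(noisy_bar, PROTOTYPES))
```

With one prototype per class, the decision depends entirely on how well
the single stored pattern overlaps the input, which is why small shifts
or stroke changes can flip the result.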

In some cases, classes differ only in structure. The skeleton is then
used to describe and recognize a shape. Many skeletons are stored in
memory as templates. For an unknown pattern, the skeleton is extracted
and matched to the template skeletons for classification. There are many
techniques for extracting the skeleton of a shape, such as the medial
axis transformation (MAT) [33,34], polygonal approximation [29], and
Fourier transformation [17]. MAT was the first technique studied, but it
is computationally expensive and sensitive to noise. Polygonal
approximation and Fourier transformation are insensitive to noise, but
also insensitive to boundary details. A skeleton is usually represented
as a graph.

2.2 Statistical Method

Statistical pattern recognition [1,2,3] has developed rapidly since
computers came into use in the early 1960's. For processing pictorial
data, the statistical pattern recognition system can be described
as in Figure 2.2.

We  first  select  the  important  features  which  can  well  characterize 
the  pattern  and  teach  the  machine  how  to  extract  these  features.  After 
the  preprocessing,  the  machine  extracts  the  feature  values  from  the  unk- 
nown input  pattern  and  uses  them  for  classification.  These  selected 
features  should  be  sensitive  to  different  classes  yet  insensitive  to 
noise and changes of input patterns. However, they usually depend upon
the particular application. The features often used will be described in
the following subsections.

2.2.1 Topological Features [3]

Shape is sometimes defined to be more complicated than simply the
outer contour of the object. That is, the shape may contain many
regions. If the image is twisted or the object is deformed, the
appearance will be different. But there are some topological properties
which are invariant with respect to any planar stretching deformation.
They are the number of connected components, C, the number of holes, H,
and hence the Euler number, E = C - H [3].
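
As an illustration, C, H, and E can be counted on a small binary image
by flood fill. The sketch below rests on assumptions that the text does
not state: object pixels are treated as 8-connected and background
pixels as 4-connected, and a hole is taken to be a background region
that does not reach the image border. The function names and the test
pattern are hypothetical.

```python
from collections import deque

FOUR = [(-1, 0), (1, 0), (0, -1), (0, 1)]
EIGHT = FOUR + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def count_components(grid, value, neighbors):
    """Count connected regions of cells equal to `value` using flood fill."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != value or seen[r][c]:
                continue
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in neighbors:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] == value and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


def euler_number(image):
    """Return (C, H, E) for a binary image, with E = C - H."""
    cols = len(image[0])
    # Pad with background so that the outside forms one background region.
    padded = ([[0] * (cols + 2)]
              + [[0] + list(row) + [0] for row in image]
              + [[0] * (cols + 2)])
    c = count_components(padded, 1, EIGHT)
    h = count_components(padded, 0, FOUR) - 1  # exclude the outside region
    return c, h, c - h


# Hypothetical pattern shaped like the numeral 8: one component, two holes.
EIGHT_SHAPE = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
]

print(euler_number(EIGHT_SHAPE))
```

Stretching or bending the pattern changes the pixel layout but not the
counts, which is the invariance the paragraph above refers to.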

For pictures with straight-line boundaries, known as polygonal
networks, the numbers of vertices, sides, and faces are also important
measurements. Greanias [45] and Munson [46] applied these properties to
character recognition. Yokoi et al. [47] did further analysis on binary
pictures. Rosenfeld [48] studied the converse to the Jordan curve theorem
for digital curves.

The  topological  properties  are  useful  for  stretchable  objects  and 
rubber-sheet  images,  but  they  cannot  fully  describe  a shape. 

2.2.2  Geometrical  Features 

A quantitative  way  to  describe  the  irregularity  of  a shape  is  to 
measure  the  variation  of  the  boundary,  which  can  be  represented  by  a 
chain  of  vectors.  The  most  common  mechanisms  measuring  this  variation 
use  the  length  of  the  vectors,  the  angles  between  the  vectors,  and  the 
distance  (radii)  from  the  centroid  of  the  shape  to  the  boundary.  For  a 
regular  shape,  there  should  be  little  variation  in  these  measurements, 
but  with  increasing  irregularity  there  will  be  greater  variation. 
Attneave [14] found that the number of turns, the angular variability,
and the appearance of symmetry are correlated with human judgement of
the complexity of a shape. Lee [54] developed a dissimilarity measurement
of polygons from the dissimilarity of two triangles. The dissimilarity
of two triangles, ABC and XYZ, is defined as the minimum over the 6
possible vertex correspondences of |A - X| + |B - Y| + |C - Z|, where
A, B, C and X, Y, Z are the angles of the two triangles, respectively.
He used it for chromosomal classification. Young [13] introduced a
measure of the rate of angular change.
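
The triangle dissimilarity can be written directly as a minimum over the
six orderings of one triangle's angles. The sketch below assumes angles
given in degrees; the function name `triangle_dissimilarity` is
hypothetical, and the extension from triangles to general polygons in
[54] is not reproduced here.

```python
from itertools import permutations


def triangle_dissimilarity(angles_abc, angles_xyz):
    """Minimum of |A-X| + |B-Y| + |C-Z| over the 6 vertex correspondences."""
    return min(
        sum(abs(p - q) for p, q in zip(angles_abc, perm))
        for perm in permutations(angles_xyz)
    )


# The same triangle with its vertices relabelled has zero dissimilarity.
print(triangle_dissimilarity((90, 60, 30), (30, 90, 60)))

# A right isosceles triangle compared with an equilateral one.
print(triangle_dissimilarity((90, 45, 45), (60, 60, 60)))
```

Taking the minimum over all correspondences makes the measure
independent of which vertex is listed first.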


Sklansky [11] suggested a measurement of concavity. This measurement
consists of two parts: (1) the concave area and (2) the depth of
each of the concave parts. For a shape with 4 concavities, the
measurement consists of one value for the area and 4 depths for the 4
concavities. The concavities may imply the overlap of convex shapes [12].

Some quantitative measurements of shape complexity are often used
for biological patterns. Kiefer [15] used a measurement D to reflect
the compactness:

$$D = 2\pi \sum_{i=1}^{N} d_i^2 / A$$

For each point i of the N points in a shape, the Euclidean distance $d_i$
from the arithmetic mean of all the shape points is squared and summed.
In other words, $d_i^2 = (X_i - \bar{X})^2 + (Y_i - \bar{Y})^2$. Rosenfeld
[69,70], Bacus and Gose [8], and Green [9] studied another measurement
of compactness, (perimeter)^2/area. The area, spicularity (number of
spicules on the boundary), and eccentricity (ratio of eigenvalues of the
pixel distribution) were also used to classify red blood cells [53].
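
The two ingredients of these compactness measures can be sketched as
follows. The first function computes only the summed squared distance
from the mean point, without the normalizing constants of D. The second
assumes the usual squared-perimeter form of the ratio and a shape given
as a simple polygon. Both function names and the test shapes are
hypothetical.

```python
import math


def sum_squared_distances(points):
    """Sum of squared Euclidean distances from the points' arithmetic mean."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    return sum((x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in points)


def perimeter_squared_over_area(vertices):
    """Compactness P^2 / A of a simple polygon given by its ordered vertices."""
    edges = list(zip(vertices, vertices[1:] + vertices[:1]))
    perimeter = sum(math.dist(p, q) for p, q in edges)
    # Shoelace formula for the polygon area.
    area = abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in edges)) / 2
    return perimeter ** 2 / area


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
strip = [(0, 0), (8, 0), (8, 0.5), (0, 0.5)]  # same area, far less compact

print(sum_squared_distances(square))
print(perimeter_squared_over_area(square))
print(perimeter_squared_over_area(strip))
```

The square and the strip have the same area, so the larger ratio for
the strip reflects only its elongated outline.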

For some particular applications, special features are used. Kruger
et al. [75] used 8 special features to describe the size and shape of a
heart for radiograph diagnosis. The machine is told to find the cardiac
area, draw a chest midline, which is a vertical line dividing the
thoracic cavity into two more or less equal areas, and then measure the 8
features. The chest midline divides the cardiac area into two parts.
The 8 features are the maximum lengths, diagonals, and areas of the two
parts and the whole area. These features are also used by other
researchers [68] for cardiac size and shape description.

Chien and Fu [68] used the coefficients of the polynomial which fits
the boundary (by the least-squares fitting method) to describe the shape
of the heart in a chest x-ray picture. Matson et al. [10] also used
the least-squares fitting method to find the best-fitting ellipse in
studying material characteristics. From the ellipse, the orientation
and the aspect ratio of a shape can be obtained. The aspect ratio is the
ratio of the major axis to the minor axis of the ellipse.

Among  the  features  discussed  above,  some  are  based  on  angles  and 
angular  changes  while  some  are  based  on  concavities.  Some  features  re- 
flect the  global  complexities  of  a shape.  Some  features  are  extracted 
from  the  smoothed  curve  which  fits  the  boundary.  Chien  and  Fu  [68] 
pointed  out  that  there  is  no  general  theory  for  selecting  features. 
Therefore,  the  heuristic  approach  is  usually  taken  to  select  features 
based  on  the  researchers'  knowledge  about  the  picture  patterns  and  the 
differences  between  classes  of  the  picture  patterns.  However,  there  are 
two  analytic  feature  families,  Fourier  descriptors  and  moments.  They 
will  be  discussed  in  the  following  two  sections. 

2.2.3  Fourier  Descriptor 

Fourier transformation has been used for preprocessing [19,22],
description, and classification [17,18,20]. It is one of the most popular
techniques in picture processing [17-20] mainly because (1) it is a
complete  transformation,  and  (2)  some  values  in  the  frequency  domain  are 
invariant  with  respect  to  translation  and  rotation  in  the  time  domain. 
The  transformation  is  complete  in  the  sense  that  no  information  is  lost 
in  the  operation,  since  the  original  data  in  the  time  domain  can  be  ob- 
tained by  applying  the  inverse  transformation  to  the  frequency  domain. 
The  invariant  values  in  the  frequency  domain  can  be  used  as  features  for 
classification.  In  addition  to  these  merits,  the  Fast  Fourier 
Transform,  FFT,  reduces  the  computational  cost  a great  deal  [16,21]. 

Fourier transformation can be applied to a one-dimensional boundary
or to a two-dimensional picture. A digital picture can be considered as
a two-dimensional array. The two-dimensional Fourier transformation is
done by simply applying the one-dimensional Fourier transformation to
the array horizontally and vertically in sequence [22]. Since we are
concentrating on the shape, which is the boundary of an object in the
picture, we will discuss only the one-dimensional Fourier transformation
applied to the shape. There are at least three different definitions
of a Fourier descriptor:

a. Suppose that the closed contour has length L. $\phi(l)$, defined along
the contour, is the net amount of angular bend between the starting
point $l = 0$ and the point $l$, $0 \le l \le L$. We then define [18]

$$\phi^*(t) = \phi\left(\frac{Lt}{2\pi}\right) + t, \qquad 0 \le t \le 2\pi .$$

$\phi^*(t)$ is invariant with respect to translation, rotation, and
scaling. Expanding $\phi^*(t)$ in a Fourier series gives

$$\phi^*(t) = \mu_0 + \sum_{k=1}^{\infty} A_k \cos(kt - \alpha_k)$$

or

$$\phi^*(t) = \mu_0 + \sum_{k=1}^{\infty} (a_k \cos kt + b_k \sin kt) .$$

Then $\{A_k, \alpha_k \mid k = 1, \ldots, \infty\}$ or
$\{a_k, b_k \mid k = 1, \ldots, \infty\}$ is the Fourier descriptor. When
the starting point changes, the phase angles of the descriptors change,
while the magnitudes of the coefficients remain invariant. Zahn and
Roskies [18] investigated the relationship between Fourier descriptors
and the geometry of shapes. They also studied the theories for shape
generation from arbitrary Fourier descriptors.

b. If $X(l)$ and $Y(l)$, with $0 \le l \le L$, are the coordinate
functions of the points on the closed contour, then a complex function,
$U(l)$, and a complex integration, $a_n$, can be defined as follows [17]:

$$U(l) = X(l) + j\,Y(l)$$

$$a_n = \frac{1}{L} \int_0^L U(l)\, e^{-j(2\pi/L) n l}\, dl$$

Persoon and Fu [17] discovered that the mean square distance between
two Fourier descriptors equals the mean square distance between the
two corresponding curves. Also, they found that the skeleton of a
shape can be extracted from Fourier descriptors. Granlund [19] applied
this technique to character recognition. Richard and Hemami [20] used
it for three-dimensional object identification. Wallace and Wintz [21]
used interpolation techniques to obtain a desired number of boundary
points. The number of boundary points has to be a power of 2, [...]
also studied [...] translation.

c. For simple shapes, there is another definition for a Fourier
descriptor. Let $\gamma(\theta)$ be the distance function from a contour
point at angle $\theta$ to the centroid. $\gamma(\theta)$ can be expanded
as follows [23]:

$$\gamma(\theta) = a_0 + \sum_{k=1}^{\infty} (a_k \cos k\theta + b_k \sin k\theta)$$

Rutovitz [23] found that the first six terms of the $a_k$'s and $b_k$'s
were adequate to represent a chromosome for discrimination. But this
definition is restricted to shapes which can be represented by
single-valued $\gamma(\theta)$ functions.
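
The invariance properties claimed for Fourier descriptors can be checked
numerically with a discrete analogue of definition b. The sketch below
assumes N equally spaced boundary samples and replaces the integral by a
plain sum, so it is an approximation of the continuous definition rather
than the formulation of [17]. It uses a direct O(N^2) sum, not the FFT,
and does not normalize for scale. The function names and the sample
contour are hypothetical.

```python
import cmath
import math


def fourier_descriptor(points):
    """Discrete Fourier coefficients a_n of u_k = x_k + j*y_k."""
    n_pts = len(points)
    u = [complex(x, y) for x, y in points]
    return [
        sum(u[k] * cmath.exp(-2j * math.pi * n * k / n_pts) for k in range(n_pts)) / n_pts
        for n in range(n_pts)
    ]


def magnitudes(coeffs):
    """Drop a_0 (the centroid) and keep |a_n|, which ignore phase changes."""
    return [abs(a) for a in coeffs[1:]]


# Hypothetical closed contour: 8 boundary samples of a 2 x 2 square.
contour = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]

# The same shape rotated by 30 degrees, shifted, and traced from a
# different starting point.
theta = math.radians(30)
moved = [
    (x * math.cos(theta) - y * math.sin(theta) + 5.0,
     x * math.sin(theta) + y * math.cos(theta) - 3.0)
    for x, y in contour[3:] + contour[:3]
]

m1 = magnitudes(fourier_descriptor(contour))
m2 = magnitudes(fourier_descriptor(moved))
print(max(abs(p - q) for p, q in zip(m1, m2)) < 1e-9)
```

Translation affects only a_0, while rotation and a change of starting
point multiply each a_n by a unit-modulus factor, so the magnitudes
agree to rounding error.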


2.2.4  Moment  Method 

The method of moments [24,25,26] is one of the earliest techniques
studied. For a binary digital image, the moments with respect to the
centroid can be defined as follows [6]:

$$\mu_{pq} = \frac{1}{N} \sum_{i=1}^{N} (U_i - \bar{U})^p (V_i - \bar{V})^q$$

where $\bar{U}$ and $\bar{V}$ are the mean values of the image
coordinates U and V, respectively, and the summation is over all the
image points of the shape. They can be only the contour points or the
body points. Hu [25] derived a set of moment functions which have the
desired property of invariance under image translation or rotation.
These functions were therefore called moment invariants. Moment
invariants were further investigated for 3-dimensional object
identification. Dudani [6] selected 7 moment invariants for the contour
and for the body of the shape, respectively. All 14 feature values were
used for classifying aircraft. Liu [27] combined features selected from
Fourier descriptors, moments, and geometrical measurements to form a
better feature set for white blood cell classification. Mui, Fu, and
Bacus [61,62] selected 17 features from a set of 367 features to
classify white blood cell neutrophils into band neutrophils and
segmented neutrophils.
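
A central moment of this kind is straightforward to compute from a list
of shape points. The sketch below covers only the moments about the
centroid, which are unchanged by translation; it does not construct the
rotation-invariant moment functions of [25]. The function name
`central_moment` and the pixel block are hypothetical.

```python
def central_moment(points, p, q):
    """mu_pq = (1/N) * sum_i (U_i - mean_U)^p * (V_i - mean_V)^q."""
    n = len(points)
    mean_u = sum(u for u, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    return sum((u - mean_u) ** p * (v - mean_v) ** q for u, v in points) / n


# Hypothetical shape: a filled 4 x 2 block of pixels (body points).
block = [(u, v) for u in range(4) for v in range(2)]
shifted = [(u + 10, v - 7) for u, v in block]

print(central_moment(block, 2, 0), central_moment(block, 0, 2))
print(central_moment(block, 2, 0) == central_moment(shifted, 2, 0))
```

The second-order moments show that the block is more spread out along U
than along V, and shifting the block leaves them unchanged.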

All the statistical methods discussed in Sections 2.2.1-2.2.4
attempt to find features which are insensitive to rotation,
translation, and scaling of the pattern so that the image pattern can be
represented  by  a set  of  n features.  In  other  words,  the  pattern  can  be 
represented  as  a point  in  the  n-dimensional  feature  space.  The  points 
corresponding  to  patterns  of  the  same  class  are  usually  assumed  to  be 
close  together  in  the  feature  space  while  those  of  different  classes  are 
assumed  to  not  be  close  together.  Thus,  the  feature  space  can  be  parti- 
tioned into  many  regions  corresponding  to  different  classes.  Then,  the 
classification  problem  becomes  a problem  of  obtaining  discriminant  func- 
tions which  can  well  partition  the  feature  space.  The  unknown  pattern 
whose  corresponding  point  falls  in  one  of  the  regions  is  assigned  to  the 
class  corresponding  to  that  region. 

Statistical  theories  [2]  prove  that  the  Bayesian  classifier,  the 
maximum  likelihood  classifier,  is  the  minimum  error  classifier.  If  the 
distribution  of  the  patterns  in  the  feature  space  is  parametrical,  the 
classification  becomes  easier  after  the  parameters  of  the  distribution 
are  obtained.  Some  applications  assuming  Gaussian  distribution  achieved 
reasonably good results. The learning techniques, or estimation, of
Gaussian parameters can be found in [1]. Unfortunately, the assumption
of a parametric distribution lacks theoretical foundations. The
assumption may not be true if a pattern vector of large dimensionality is
required to describe very complex shapes. The situation may be worse if
the complex shapes do not differ significantly for different classes.

Discriminant functions for nonparametric distributions have also been
studied [1,2,3]. Linear, piecewise linear, or nonlinear discriminant
functions can be obtained iteratively from the training patterns. But
the  learning  procedure  is  time  consuming.  The  training  time  increases 
rapidly  with  the  complexity  of  the  function  and  with  the  number  of 
training  samples.  The  K-nearest-neighbor  method  is  often  used  to  solve 
the  classification  problem  of  non-parametrical  distributions.  All  the 
known  vector  patterns  are  stored  in  the  computer.  For  an  unknown  vector 
pattern,  the  computer  finds  the  K nearest  known  patterns  in  its  neigh- 
borhood and  then  assigns  the  unknown  to  the  majority  class  of  these  K 
nearest  neighbors.  Although  this  method  does  not  require  training  time, 
the  classification  time  and  storage  for  known  patterns  increase  with  K 
and  the  number  of  known  patterns.  In  fact,  the  K-nearest-neighbor 
method  is  a more  intelligent  template  matching  technique.  It  can  also 
be  considered  as  a combination  of  the  statistical  method  and  template 
matching  method. 
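
The K-nearest-neighbor rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `knn_classify`, the Euclidean distance, and the toy feature vectors are not taken from the original work.

```python
import math
from collections import Counter

def knn_classify(known, unknown, k=3):
    """Assign `unknown` to the majority class of its k nearest stored patterns.

    known   : list of (feature_vector, label) pairs (the stored patterns)
    unknown : feature vector to classify
    """
    # Rank every stored pattern by Euclidean distance to the unknown pattern.
    ranked = sorted(known, key=lambda item: math.dist(item[0], unknown))
    # Majority vote among the k nearest stored patterns.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-class training set in a two-dimensional feature space.
known = [((0.0, 0.0), "A"), ((0.1, 0.2), "A"), ((0.2, 0.1), "A"),
         ((1.0, 1.0), "B"), ((0.9, 1.1), "B"), ((1.1, 0.9), "B")]
print(knn_classify(known, (0.15, 0.15), k=3))
```

No training step is needed, but every call scans and sorts all stored patterns, which is the classification-time and storage cost mentioned in the text.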

2.3 Syntactic Method

The structural property of a contour is usually a deciding factor in recognition. In the statistical method, some features are designed so that they implicitly reflect the global structure of the shape through numerical values. For example, the Fourier descriptor expresses the global structure in terms of values in the frequency domain. The syntactic method [4,5] can use production rules to describe the structure of the contour explicitly, if appropriate primitives can be found. Generally speaking, syntactic pattern recognition consists of three steps: preprocessing, primitive extraction, and syntax analysis. Figure 2.3 is a diagram of the system.


For shape recognition problems, we can illustrate the syntactic method with an example of chromosome recognition [5]; see Figure 2.4. Let us suppose the smooth boundary of a submedian chromosome is obtained after the preprocessing stage. The machine is given the specifications of the primitives a, b, c, d. The specifications may be numerical attributes such as length, angle, radii, etc. The computer processes the boundary to extract the primitives which fit the specifications and represents the boundary as a concatenated string of primitives. Thus, the pattern is treated as a sentence, or a string. Then the machine analyzes the syntax of the string according to a given grammar. We assume that there is a grammar for each class, and that the grammar can generate all the possible patterns in the class. A syntax analyzer is called a parser. A parser is used to find a derivation of the string through the use of the production rules. The unknown pattern is accepted by a class if a derivation tree is found with respect to the corresponding grammar. Because of its capability for structural description, the syntactic approach has received more and more attention in recent years [28,29,57].


Figure 2.4 A Syntactic Chromosome Recognition System

In addition to chromosome recognition [6,35], syntactic methods have many applications in picture recognition. Shaw [36,37] developed the Picture Description Language. Narasimhan [38] proposed a grammar for hand-printed FORTRAN characters. A number of authors have used the syntactic approach to generate or recognize Chinese characters [5]. Pavlidis used some simple primitives to describe and recognize handwritten numerals [28].

Because of noise on the boundary and the fuzziness involved in extracting the primitives, the pattern representation can sometimes be incorrect. To handle these incorrect representations, grammars are designed to generate the noisy and distorted patterns under consideration. Thus, an ambiguous situation develops when one pattern can be generated by more than one grammar. Stochastic languages can resolve the ambiguity on the basis of probabilities [39,40,41]. Lee and Fu [58] studied the inference of the probability assignments for a stochastic context-free language and its application to chromosome identification. Recently, the study of error-correcting techniques [30,31,32] has rendered the syntactic method capable of processing partially corrupted patterns. An error-correcting parser can find a derivation tree which derives the pattern that differs from the unknown pattern by a minimum number of errors. The number of errors is considered to be the distance between the unknown and the class, or the language, generated by the grammar. Classification is accomplished by assigning the unknown to the closest class. Syntactic clustering by a distance criterion is therefore also possible [32,84].
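
The minimum-distance idea can be illustrated with a deliberately simplified Python sketch. It is not an error-correcting parser: as an assumption made only for this illustration, each class language is replaced by a small finite list of sample sentences, and the number of errors is taken to be the Levenshtein (substitution, insertion, deletion) distance. The names `edit_distance` and `classify` and the sample strings are hypothetical.

```python
def edit_distance(s, t):
    """Levenshtein distance: minimum number of substitutions, insertions, deletions."""
    prev = list(range(len(t) + 1))
    for i, a in enumerate(s, start=1):
        cur = [i]
        for j, b in enumerate(t, start=1):
            cur.append(min(prev[j] + 1,               # delete a
                           cur[j - 1] + 1,            # insert b
                           prev[j - 1] + (a != b)))   # substitute a by b
        prev = cur
    return prev[-1]

def classify(unknown, languages):
    """Assign `unknown` to the class whose (finite, sampled) language is closest."""
    def distance_to(label):
        return min(edit_distance(unknown, s) for s in languages[label])
    return min(languages, key=distance_to)

# Hypothetical primitive strings standing in for two class languages.
languages = {"class1": ["abab", "abcab"], "class2": ["ccdd", "cdcd"]}
print(classify("abcb", languages))
```

A real error-correcting parser computes the same kind of distance against every sentence the grammar can generate, without enumerating them.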


2.4 Conclusion

In the previous sections of this chapter, most of the existing shape recognition methods were discussed. The primitive and restricted prototype matching technique is useful in recognizing machine-printed characters, but it does not satisfy the needs of a more intelligent recognizer. The more sophisticated skeleton matching method can be used to recognize noise-free and structurally different images. It may not work with shapes of different classes which differ only in some local details. The statistical method is the most popular technique. This approach is effective under two conditions: (1) when we can select appropriate features which are sensitive to the class differences and insensitive to noise, and (2) when the discriminant surfaces in the feature space can be easily and correctly obtained. Unfortunately, the features are usually difficult to select and depend upon the particular application. The pattern distributions are generally non-parametric, which increases the difficulty of finding the discriminant surfaces. The Fourier descriptors and moment invariants are perhaps the best features for rigid-body objects, but they are not so effective when noise is present. We can categorize noise into two kinds by its nature: random noise and non-random distortion. Random noise has been studied for decades. In imagery data, it can be taken care of by using digital filtering techniques [22,90] in the preprocessing stage. For detecting the shape boundary, many techniques have been developed [64-67]. Non-random distortion may appear in any application. For example, an airplane may be partially covered by clouds, or an automobile may be behind a tree. Some aspect of the shape is badly distorted or disappears. This kind of distortion in the picture may damage the Fourier descriptors or moment invariants completely, because the distortion may not only affect some boundary details but may also change the overall shape structure. Humans can still recognize the object by seeing only the unaltered part of the shape.

It appears that the syntactic method is the most promising method, since it uses production rules to explicitly describe the structure of a shape and primitives to describe the local boundary details. For a partially damaged shape, error-correcting techniques can be used to recognize the object from the unaltered part of the shape and, potentially, to correct the distorted part of the shape.

CHAPTER 3

SHAPE  DESCRIPTION  AND  RECOGNITION  USING  ATTRIBUTED  GRAMMARS 

3.1 Introduction

This  chapter  contains  the  main  portion  of  our  proposed  method.  The 
syntactic  method  can  explicitly  describe  the  global  structure  by  grammar 
rules,  and  the  local  boundary  details  by  primitives  with  numerical  at- 
tributes or  features.  These  associated  attributes  carry  the  semantic 
information  of  the  primitives.  As  indicated  in  the  previous  chapters, 
the  proposed  method  attempts  to  solve  general  shape  recognition  prob- 
lems. In  other  words,  we  need  to  develop  a method  which  can  describe 
arbitrary  shapes,  and  a technique  which  can  recognize  the  shapes  from 
their  descriptions. 

In order to avoid the need for a context-sensitive grammar, we try
to use more sophisticated primitives. To describe an arbitrary shape,
we need to consider a large number of possible primitives, although we
may not need all of them for one shape. To extract the primitives from a
shape, the machine must know the specifications of every primitive used.
It is obvious that we need a systematic way to describe the large number
of possible primitives. Though the number of primitives used in any
particular application is finite, the number of possible primitives for
arbitrary  shapes  may  be  very  large  unless  we  restrict  ourselves  to  spe- 
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which  uses  four  features  to  describe  an  arbitrary  curve.  The  four 
features  range  continuously  on  the  real  line.  That  is,  there  are  an  in- 
finite number  of  feature  values.  The  additive  property  of  the  primitive 
descriptors  was  discovered  and  will  be  discussed.  In  Section  3.3,  the 
exact  and  efficient  equations  for  computing  features  in  discrete  cases 
are  derived  by  applying  the  additive  property.  In  Section  3.4,  shape 
grammars  are  explained  and  illustrated  by  examples.  In  this  section  we 
can  see  the  advantage  of  using  our  more  sophisticated  primitives,  i.e., 
the  simplicity  of  the  shape  grammars. 

Following  the  descriptive  method,  we  will  investigate  the  recogni- 
tion techniques  in  subsequent  sections.  In  Section  3.5,  we  first  intro- 
duce a transformation  which  maps  the  four  features  in  a descriptor  to 
another  set  of  four  variables.  These  four  variables  constitute  a 
transformed  descriptor  which  is  theoretically  invariant  with  respect  to 
translation,  rotation,  and  scaling.  Then  we  will  study  the  effect  of 
digitization noise on the transformed descriptor. From the noise study,
we could construct a recognition function for primitives. However, many
confusing situations may occur when extracting primitives, due to noise
or to the similarity of different parts. These sources of confusion can be
resolved by using the context information which is used in parsing. In
Section 3.6, we introduce the concept of primitive-extraction-embedding
and develop two parsing algorithms which embed the primitive extraction.
Because the definition of a primitive is flexible and the specification of
a primitive is not unique, a classification tree scheme is possible for a
multi-class problem. The feasibility of this is discussed in Section
3.7. This chapter concludes with a discussion.

3.2 Primitives and Attributes

Section  2.3  describes  syntactic  recognition  with  a chromosome  exam- 
ple. It  is  not  difficult  to  see  that  the  four  primitives  selected  for 
the  submedian  chromosome  are  insufficient  for  describing  an  arbitrary 
shape.  In  applying  the  syntactic  method  to  an  arbitrary  shape  problem, 
we  encountered  three  problems:  (1)  what  are  the  appropriate  primitives? 
(2)  how  do  we  specify  the  primitives?  and  (3)  how  do  we  describe  the 
shape  by  production  rules  with  these  primitives? 

The meaning of a primitive may lead us to use the simplest elements of the shape as primitives. For example, unit vectors, pixels, Freeman chain codes [42], etc., can be used to describe the shape. But the simpler the primitives are, the more complex the grammars must be. Consequently, a more complicated parsing procedure and more computation time are required. If fixed-length curve segments, e.g., unit vectors, are used as primitives, we may need context-sensitive grammars to handle the scaling problem. The parsing of context-sensitive grammars is difficult and should be avoided whenever possible. Therefore, we propose to use more sophisticated primitives and simpler grammars to tackle shape description and recognition problems.

Since a curve can be characterized by its curvature function [92], we have the following definitions. Note that the curvature function $f(l)$ of a curve segment is the derivative of the direction along the curve segment with respect to length. That is,

\[
f(l) = \lim_{\Delta l \to 0} \frac{1}{\Delta l}
\left(\text{angle between the tangent lines to the curve segment at } l - \tfrac{1}{2}\Delta l \text{ and } l + \tfrac{1}{2}\Delta l\right).
\]


Definition 3.1: A curve segment, $X_1X_2$, is a directional line with a starting point $X_1$ and an ending point $X_2$. The curve segment has a curvature function, $f(l)$, along the direction with $0 \le l \le L$, where $L$ is the total length of the curve segment.

To simplify our analysis, we define a simple curve segment as follows.

Definition 3.2: A simple curve segment is a curve segment with either $f(l) \ge 0$, or $f(l) \le 0$, for $0 \le l \le L$. See Figure 3.1 for illustrations.

The curvature function of a smooth curve is a continuous function in the continuous case, but it is a summation of pulses in the discrete case. A curve segment is clockwise if $f(l) \ge 0$, and counter-clockwise if $f(l) \le 0$.


To  allow  many  possible  simple  curve  segments,  we  use  a finite  num- 
ber of  feature  values  to  characterize  a simple  curve  segment.  Intui- 
tively, we  would  use  the  length  and  the  angular  change  of  the  curve  seg- 
ment as  features.  But  sometimes,  we  would  like  to  distinguish  between 
two  curves  with  the  same  length  and  angular  change  but  different  lengths 
of  the  vector  which  connects  the  two  ends  of  the  curve.  See  Figure  3.2 
for an illustration.

In addition to the above three features, we need a measure of symmetry to resolve some ambiguities. For example, the shape of a heart is decomposed into two parts (see Figure 3.3). The two parts are mirror images of each other. If the two parts are rotated and aligned with the vectors as in Figure 3.3(c), we can see their different declinations. A symmetry measurement, S, is needed to reflect the declination of a curve. We found that these four features can sufficiently characterize a simple curve segment. The features are called attributes.

Figure 3.1 Two Simple Curve Segments

Figure 3.2 Two Curves Have the Same Curve Length and Same Angle but Different Vector Lengths

Figure 3.3 Decomposition of a Heart


Definition 3.3: The C-descriptor of a curve segment $p = X_1X_2$ has four attributes, $\vec{C}$, $L$, $A$, and $S$, i.e.,

\[
D(p) = (\vec{C}, L, A, S),
\]

where

\[
\vec{C} = \overrightarrow{X_1X_2}, \qquad A = \int_0^{L} f(l)\,dl,
\]

and

\[
S = \int_0^{L}\left(\int_0^{s} f(l)\,dl - \frac{A}{2}\right)ds .
\]

$\vec{C}$ is the vector pointing from $X_1$ to $X_2$, $L$ is the total length of the curve, $A$ is the total angular change, and $S$ measures the symmetry. Figure 3.4 illustrates the function of $S$. When $p$ is symmetric, $S = 0$. If $p$ is not symmetric, then $S > 0$ when $p$ is declined to the left, and $S < 0$ when $p$ is declined to the right. In a sense, $S$ measures the degree of declination. However, the $S$ measurement becomes less meaningful when the curve segment is not simple. The four attributes do not uniquely define a curve segment unless more restrictions are added. For example, if $S = 0$, $A = 0$, and $L = |\vec{C}|$, the curve segment is a straight line vector, $\vec{C}$.
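
As a numerical illustration of the attributes A and S (the total angular change and the symmetry measure, defined as the integral of the accumulated angle minus A/2), the following Python sketch integrates a given curvature function with a simple midpoint/trapezoid scheme. The function name `angle_and_symmetry`, the step count, and the two test curvatures are assumptions made for this example only.

```python
def angle_and_symmetry(f, L, n=2000):
    """Approximate A = integral of f over [0, L] and
    S = integral over s of (accumulated angle up to s - A/2)."""
    h = L / n
    theta = 0.0   # accumulated angle, i.e. the inner integral of f up to s
    G = 0.0       # integral of the accumulated angle over s
    for k in range(n):
        step = f((k + 0.5) * h) * h      # angular change over this step
        G += (theta + 0.5 * step) * h    # trapezoid rule on the accumulated angle
        theta += step
    A = theta
    return A, G - 0.5 * A * L

# Constant curvature (a circular arc) is symmetric, so S should vanish.
print(angle_and_symmetry(lambda l: 1.0, 2.0))
# Curvature growing along the curve gives a non-symmetric segment, so S is nonzero.
print(angle_and_symmetry(lambda l: l, 2.0))
```

For the second curvature the exact values are A = 2 and S = 4/3 - 2 = -2/3, which the sketch approximates closely.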

To maintain flexibility in primitive definition and hence to simplify the shape grammar, we do not restrict curve primitives to simple curve segments only, although the primitives are essentially simple curve segments. Thus, a curve primitive consists of one or more simple curve segments.

In addition to the curve primitives, we need a relation to describe the connection between curve segments. This relation has one attribute which describes the connection angle between two consecutive curve segments, and will be referred to in the following paragraphs as an angle primitive.

Definition 3.4: The A-descriptor of an angle primitive, $a$, has only one attribute, $D(a) = A$, which specifies the angular change at the concatenating point of two consecutive curve segments.

Example 3.1: If a curve segment $N = X_1X_2$ is broken at point $X_3$, we may define curve segments $X_1X_3$ and $X_3X_2$ correspondingly as curve primitives $p_1$ and $p_2$, with an angle primitive, $a$, between them (see Figure 3.5). Their descriptors are:

\[
\begin{aligned}
D(p_1) &= (\vec{C}_1, L_1, A_1, S_1), & \vec{C}_1 &= \overrightarrow{X_1X_3},\\
D(p_2) &= (\vec{C}_2, L_2, A_2, S_2), & \vec{C}_2 &= \overrightarrow{X_3X_2},\\
D(a) &= a,\\
D(N) &= (\vec{C}_N, L_N, A_N, S_N), & \vec{C}_N &= \overrightarrow{X_1X_2}.
\end{aligned}
\]

Definition 3.5: If a curve segment $N$ is broken into two curve segments, $N_1$ and $N_2$, with a connection angle primitive $a$, then there is a production rule, $N \to N_1 a N_2$.

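Their descriptors have an interesting additive property.

Theorem A: Additivity.

If $N \to N_1 a N_2$ with descriptors $D(N) = (\vec{C}, L, A, S)$, $D(N_1) = (\vec{C}_1, L_1, A_1, S_1)$, $D(N_2) = (\vec{C}_2, L_2, A_2, S_2)$, and $D(a) = a$, then

\[
\vec{C} = \vec{C}_1 + \vec{C}_2, \qquad L = L_1 + L_2, \qquad A = A_1 + a + A_2,
\]

and

\[
S = S_1 + S_2 + \tfrac{1}{2}\left[(A_1 + a)L_2 - (A_2 + a)L_1\right].
\]

Proof: The first three equations are obvious. The fourth equation is proved as follows. From the definition of $S$, we have

\[
\begin{aligned}
S_1 &= \int_0^{L_1}\left(\int_0^{s} f_1(l)\,dl\right)ds - \tfrac{1}{2}A_1L_1,\\
S_2 &= \int_0^{L_2}\left(\int_0^{s} f_2(l)\,dl\right)ds - \tfrac{1}{2}A_2L_2,\\
S &= \int_0^{L_1+L_2}\left(\int_0^{s} f(l)\,dl\right)ds - \tfrac{1}{2}(A_1+a+A_2)(L_1+L_2),
\end{aligned}
\]

where $f_1$, $f_2$, and $f$ are the curvature functions of $N_1$, $N_2$, and $N$, respectively, and

\[
f(l) =
\begin{cases}
f_1(l), & 0 \le l < L_1,\\
a\,\delta(l - L_1), & l = L_1,\\
f_2(l - L_1), & L_1 < l \le L_1 + L_2.
\end{cases}
\]

Then

\[
\begin{aligned}
S &= \int_0^{L_1}\left(\int_0^{s} f(l)\,dl\right)ds + \int_{L_1}^{L_1+L_2}\left(\int_0^{s} f(l)\,dl\right)ds - \tfrac{1}{2}(A_1+a+A_2)(L_1+L_2)\\
  &= S_1 + \int_{L_1}^{L_1+L_2}\left[\left(\int_{L_1}^{s} f_2(l - L_1)\,dl\right) + A_1 + a\right]ds
     - \tfrac{1}{2}\left(L_1A_2 + L_1a + L_2A_1 + L_2A_2 + L_2a\right).
\end{aligned}
\]

The single point $s = L_1$ contributes nothing to the outer integral, since $\int_0^{s} f(l)\,dl$ is finite. After substituting $S_2$ into $S$, we obtain

\[
S = S_1 + S_2 + \tfrac{1}{2}\left[(A_1 + a)L_2 - L_1(A_2 + a)\right].
\]

Q.E.D.

Definition 3.6: $D(N_1) \oplus_a D(N_2)$ denotes the above additivity.

Corollary A.1: If $N \to N_1 a N_2$, then $D(N) = D(N_1) \oplus_a D(N_2)$.

Theorem B: The addition, $\oplus$, is associative.

If $N \to N_1 a_1 N_2 a_2 N_3$, then

\[
\begin{aligned}
D(N) &= D(N_1) \oplus_{a_1} D(N_2) \oplus_{a_2} D(N_3)\\
     &= \left[D(N_1) \oplus_{a_1} D(N_2)\right] \oplus_{a_2} D(N_3)\\
     &= D(N_1) \oplus_{a_1} \left[D(N_2) \oplus_{a_2} D(N_3)\right].
\end{aligned}
\]

The proof is obvious.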
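
The additivity of Theorem A can be read as a binary operation on descriptors. The Python sketch below is one possible encoding and is not part of the original system: the `Descriptor` class, the `combine` function, and the sample numbers are illustrative assumptions, and the end-to-end vectors are placeholders rather than a geometrically consistent polygon. It combines three straight segments in both groupings to illustrate the associativity stated in Theorem B.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Descriptor:
    cx: float  # x component of the end-to-end vector
    cy: float  # y component of the end-to-end vector
    L: float   # total length
    A: float   # total angular change
    S: float   # symmetry measure

def combine(d1, a, d2):
    """Descriptor of N -> N1 a N2 from D(N1), the connection angle a, and D(N2)."""
    return Descriptor(
        d1.cx + d2.cx,
        d1.cy + d2.cy,
        d1.L + d2.L,
        d1.A + a + d2.A,
        d1.S + d2.S + 0.5 * ((d1.A + a) * d2.L - (d2.A + a) * d1.L),
    )

# Three straight segments (A = S = 0) joined by angles a1 and a2.
v1 = Descriptor(1.0, 0.0, 1.0, 0.0, 0.0)
v2 = Descriptor(0.0, -2.0, 2.0, 0.0, 0.0)
v3 = Descriptor(-3.0, 0.0, 3.0, 0.0, 0.0)
a1, a2 = 0.5, 0.25

left = combine(combine(v1, a1, v2), a2, v3)    # [D(N1) + D(N2)] + D(N3)
right = combine(v1, a1, combine(v2, a2, v3))   # D(N1) + [D(N2) + D(N3)]
print(left)
print(right)
```

Both groupings give the same descriptor, so a parser may reduce the right-hand side of a production in any order.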


3.3 Computation of C-Descriptors in a Discrete Case

The above C-descriptors are defined in the continuous case, but they can be computed efficiently in the discrete case by using only addition and multiplication. We first derive two more corollaries from Theorem A.

Corollary A.2: If $N_2$, in Theorem A, is a straight line vector, i.e., $A_2 = 0$, $S_2 = 0$, then

\[
\vec{C} = \vec{C}_1 + \vec{C}_2, \qquad L = L_1 + L_2, \qquad A = A_1 + a,
\]

and

\[
S = S_1 + AL_2 + \tfrac{1}{2}A_1L_1 - \tfrac{1}{2}AL .
\]

Corollary A.3: If $N_1$ is also a straight line vector, i.e., $A_1 = 0$, $S_1 = 0$, then

\[
\vec{C} = \vec{C}_1 + \vec{C}_2, \qquad L = L_1 + L_2, \qquad A = a,
\]

and

\[
S = aL_2 - \tfrac{1}{2}a(L_1 + L_2) = AL_2 - \tfrac{1}{2}AL .
\]

Theorem C: Recursive Equations for C-Descriptors

$M = (v_1 v_2 \cdots v_m)$ is a vector chain. Let $M_j$ denote $(v_1 \cdots v_j)$, i.e., $M_m = M$. $a_i$ is the angle between $v_i$ and $v_{i+1}$, and $D(v_i) = (\vec{C}_i, L_i, 0, 0)$, $1 \le i \le m$. Then $D(M_j) = (\vec{C}_{Mj}, L_{Mj}, A_{Mj}, S_{Mj})$, $1 \le j \le m$, where

\[
\begin{aligned}
\vec{C}_{Mj} &= \vec{C}_{M(j-1)} + \vec{C}_j = \sum_{i=1}^{j} \vec{C}_i,\\
L_{Mj} &= L_{M(j-1)} + L_j = \sum_{i=1}^{j} L_i,\\
A_{Mj} &= A_{M(j-1)} + a_{j-1} = \sum_{i=1}^{j-1} a_i,\\
S_{Mj} &= G_{Mj} - \tfrac{1}{2}A_{Mj}L_{Mj},\\
G_{Mj} &= G_{M(j-1)} + A_{Mj}L_j = \sum_{i=1}^{j} A_{Mi}L_i = \sum_{i=2}^{j}\sum_{t=1}^{i-1} a_t L_i .
\end{aligned}
\]

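Proof:

(1) When $j = 1$, it is obvious that

\[
D(M_1) = (\vec{C}_1, L_1, 0, 0).
\]

(2) When $j = 2$, according to Corollary A.3, $D(M_2) = (\vec{C}_{M2}, L_{M2}, A_{M2}, S_{M2})$, i.e.,

\[
\begin{aligned}
\vec{C}_{M2} &= \vec{C}_1 + \vec{C}_2 = \sum_{i=1}^{2}\vec{C}_i,\\
L_{M2} &= L_1 + L_2 = \sum_{i=1}^{2} L_i,\\
A_{M2} &= a_1,\\
S_{M2} &= G_{M2} - \tfrac{1}{2}a_1(L_1 + L_2) = G_{M2} - \tfrac{1}{2}A_{M2}L_{M2},
\end{aligned}
\]

and

\[
G_{M2} = a_1L_2 .
\]

(3) When $j = 3$, according to Corollary A.2, we have

\[
D(M_3) = (\vec{C}_{M3}, L_{M3}, A_{M3}, S_{M3}),
\]

whose attributes again satisfy the recursive equations of the theorem.

By mathematical induction, the theorem is proven true for any $m$.

Q.E.D.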

Theorem  C contains  the  computation  equations  in  a recursive  form. 
The  recursive  relationships  are  reversible. 

Theorem D: Reverse Relationships

$M = (v_1 v_2 \cdots v_m)$ is a vector chain. Let $M_j$ denote $(v_1 \cdots v_j)$ (i.e., $M_m = M$). $a_i$ is the angle between $v_i$ and $v_{i+1}$, and $D(v_i) = (\vec{C}_i, L_i, 0, 0)$, $1 \le i \le m$. Then $D(M_j)$ can be computed from $D(M_{j+1})$ and $D(v_{j+1})$, $1 \le j < m$:

\[
\begin{aligned}
\vec{C}_{Mj} &= \vec{C}_{M(j+1)} - \vec{C}_{j+1},\\
L_{Mj} &= L_{M(j+1)} - L_{j+1},\\
A_{Mj} &= A_{M(j+1)} - a_j,\\
S_{Mj} &= G_{Mj} - \tfrac{1}{2}A_{Mj}L_{Mj},\\
G_{Mj} &= G_{M(j+1)} - A_{M(j+1)}L_{j+1}.
\end{aligned}
\]

The proof is obvious.

With Theorem C, the attributes can be computed exactly, instead of approximately, in the discrete case. In addition, Theorem C suggests two ways of computing the C-descriptors. For a boundary chain of m vectors, if there is enough memory to store all the descriptors calculated for later processing, we need to compute at most m(m+1)/2 possible C-descriptors. The recursive equations in Theorem C suggest that each C-descriptor requires 2 multiplications, 5 additions, and 1 shift. This implies that it takes about $m^2$ multiplications to compute all the possible C-descriptors. But the total-summation equations in Theorem C suggest that the attributes of any curve segment can be computed directly from the lengths and the connecting angles of the vectors of the corresponding chain. That implies another possible implementation, in which no memory for storing C-descriptors is necessary: the C-descriptor of any nonterminal or curve primitive can be obtained directly from the boundary chain, as long as the corresponding indices of the vector subchain are known. There is a trade-off between the two implementations with regard to time and memory. In our experiment, we use the second implementation in order to have better control of the memory space.
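
The two implementations can be sketched as follows in Python. This is an illustration under stated assumptions rather than the original program: the function names, the 0-based indexing, and the omission of the end-to-end vector (a plain vector sum) are choices made here for brevity. `prefix_descriptors` follows the recursive form of Theorem C, and `direct_descriptor` recomputes the attributes of an arbitrary subchain from the lengths and connecting angles without storing anything.

```python
def prefix_descriptors(lengths, angles):
    """Recursive form: (L, A, S) of M_1, ..., M_m.

    lengths : L_1 .. L_m of the chain vectors
    angles  : a_1 .. a_(m-1), the angle between consecutive vectors
    """
    out = []
    L = A = G = 0.0
    for j, Lj in enumerate(lengths):
        if j > 0:
            A += angles[j - 1]   # A_Mj = A_M(j-1) + a_(j-1)
        L += Lj                  # L_Mj = L_M(j-1) + L_j
        G += A * Lj              # G_Mj = G_M(j-1) + A_Mj * L_j
        out.append((L, A, G - 0.5 * A * L))
    return out

def direct_descriptor(lengths, angles, i, j):
    """Direct form: (L, A, S) of the subchain v_i .. v_j (0-based, inclusive)."""
    L = sum(lengths[i:j + 1])
    A = sum(angles[i:j])
    G = acc = 0.0
    for k in range(i, j + 1):
        if k > i:
            acc += angles[k - 1]   # angle accumulated before vector k
        G += acc * lengths[k]
    return (L, A, G - 0.5 * A * L)

lengths, angles = [1.0, 2.0, 3.0], [0.5, 0.25]
print(prefix_descriptors(lengths, angles))
print(direct_descriptor(lengths, angles, 1, 2))
```

The recursive form trades memory for speed when many nested subchains are needed; the direct form recomputes each descriptor on demand from the chain itself.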


3.4 Shape Grammars

Although a picture is two-dimensional, the outer boundary of an object in the picture is one-dimensional. Thus, a one-dimensional string grammar is sufficient to describe shapes. In previous sections, we proposed to use curve segments as curve primitives and connection angles as angle primitives. If a shape is decomposed into proper curve segments and each curve segment is coded as a primitive, then the shape can be represented as a string of primitives, or a sentence. Though a shape is a closed curve, we can break it at some arbitrary point so that it can be described by a curve segment with the same starting and ending point, and an angle primitive which specifies the angular change at that point. That arbitrary point can be the first point found in tracing the boundary.

Because  of  the  attributes  associated  with  the  primitives  and  their 
additivity,  we  propose  to  use  attributed  grammar  [44]  for  shape  descrip- 
tion and  recognition. 

Definition 3.7: An attributed grammar is a grammar where (1) each primitive
or nonterminal has a symbol part and a value part which may have
several  values  called  attributes,  and  (2)  each  symbolic  production  rule 
has  a corresponding  set  of  attribute  rules  which  process  the  attributes 
related  to  the  production  rule  in  parsing. 

Let  us  illustrate  the  attributed  shape  grammar  with  an  airplane 
shape. 

Example 3.2: A simple airplane shape (see Figure 3.6) can be decomposed into four parts: nose, left wing, right wing, and fuselage and tail, denoted as $N_1$, $N_2$, $N_3$, and $N_4$ respectively. $N_4$ is further decomposed into three parts: left side, tail, and right side, denoted as $p_9$, $N_5$, and $p_{10}$ respectively. The nonterminals $N_1$, $N_2$, $N_3$, and $N_5$ can be further decomposed into primitives. For each decomposition, there is a symbolic production rule and a set of attribute rules. The angle primitives, $a_i$'s, describe the connection angles; $a$ labels the pattern. The shape grammar is as follows:

\[
\begin{aligned}
G &= (V, T, P, S_a)\\
V &= \{S_a, N_k \mid 1 \le k \le 5\}\\
T &= \{a_i\text{'s}, p_j\text{'s} \mid 1 \le i \le 7,\ 1 \le j \le 13\}
\end{aligned}
\]

$P$:

\[
\begin{aligned}
S_a &\to N_1 a_2 N_2 a_5 N_4 a_5 N_3 a_2\,\{\text{Answer}_c\}; & c &\leftarrow a\\
N_1 &\to p_2 a_1 p_1; & D(N_1) &\leftarrow D(p_2) \oplus_{a_1} D(p_1)\\
N_2 &\to p_3 a_3 p_5 a_4 p_7; & D(N_2) &\leftarrow D(p_3) \oplus_{a_3} D(p_5) \oplus_{a_4} D(p_7)\\
N_3 &\to p_8 a_4 p_6 a_3 p_4; & D(N_3) &\leftarrow D(p_8) \oplus_{a_4} D(p_6) \oplus_{a_3} D(p_4)\\
N_4 &\to p_9 a_6 N_5 a_6 p_{10}; & D(N_4) &\leftarrow D(p_9) \oplus_{a_6} D(N_5) \oplus_{a_6} D(p_{10})\\
N_5 &\to p_{11} a_7 p_{13} a_7 p_{12}; & D(N_5) &\leftarrow D(p_{11}) \oplus_{a_7} D(p_{13}) \oplus_{a_7} D(p_{12})
\end{aligned}
\]

A general form of the attributed shape grammar is $G_\ell = (V_\ell, T_\ell, P_\ell, S_\ell)$, where $S_\ell$ is the starting symbol with a special attribute $\ell$, which is the label of the pattern.

\[
\begin{aligned}
V_\ell &= \{S_\ell, N\text{'s}\}\\
T_\ell &= \{F\text{'s}, A\text{'s} \mid F: \text{curve primitive},\ A: \text{angle primitive}\}\\
P_\ell:\quad S_\ell &\to (XA)^{*}XA\,\{\text{Answer}_c\}; & c &\leftarrow \ell\\
N &\to (XA)^{*}X; & D(N) &\leftarrow \left(D(X)\,\oplus_A\right)^{*} D(X)
\end{aligned}
\]

where $X \in \{N\text{'s}, F\text{'s}\}$.

For each S-production rule, $\{\text{Answer}_c\}$ and $c \leftarrow \ell$ mean that if the parsing is successful, the shape pattern is recognized as being in the class labeled by $c$. The idea of $\{\text{Answer}_c\}$, an action symbol, is borrowed from the translational grammar [43,44]. Because of Theorems A and B, the attribute rules for each N-production rule can be obtained easily from the symbolic rule. Therefore, the attribute rules are omitted in the following examples.


Example 3.3: The shape grammar for airplane BAC 111.

\[
\begin{aligned}
G_c &= (V_c, T_c, P_c, S_c)\\
V_c &= \{S_c, N_{ci} \mid 1 \le i \le 8\}\\
T_c &= \{F_{cj}, A_{ck} \mid 1 \le j \le 15,\ 1 \le k \le 7\}
\end{aligned}
\]

$P_c$: (The corresponding segmentations are shown in Figures 3.7 and 3.8.)

\[
\begin{aligned}
(1)\quad & S_c \to N_{c1}A_{c1}N_{c2}A_{c2}F_{c1}A_{c2}N_{c3}A_{c1}N_{c4}A_{c3}\\
(2)\quad & S_c \to N_{c2}A_{c2}F_{c1}A_{c2}N_{c3}A_{c1}N_{c4}A_{c3}N_{c1}A_{c1}\\
(3)\quad & S_c \to F_{c1}A_{c2}N_{c3}A_{c1}N_{c4}A_{c3}N_{c1}A_{c1}N_{c2}A_{c2}\\
(4)\quad & S_c \to N_{c3}A_{c1}N_{c4}A_{c3}N_{c1}A_{c1}N_{c2}A_{c2}F_{c1}A_{c2}\\
(5)\quad & S_c \to N_{c4}A_{c3}N_{c1}A_{c1}N_{c2}A_{c2}F_{c1}A_{c2}N_{c3}A_{c1}\\
(6)\quad & S_c \to N_{c6}A_{c2}F_{c1}A_{c2}N_{c7}A_{c4}N_{c8}A_{c3}N_{c5}A_{c4}\\
(7)\quad & S_c \to N_{c8}A_{c3}N_{c5}A_{c4}N_{c6}A_{c2}F_{c1}A_{c2}N_{c7}A_{c4}\\
(8)\quad & N_{c1} \to F_{c2}A_{c5}F_{c3}\\
(9)\quad & N_{c5} \to F_{c2}A_{c5}F_{c4}\\
(10)\quad & N_{c2} \to F_{c5}A_{c6}F_{c6}A_{c7}F_{c7}\\
(11)\quad & N_{c6} \to F_{c8}A_{c6}F_{c6}A_{c7}F_{c7}\\
(12)\quad & N_{c3} \to F_{c9}A_{c7}F_{c10}A_{c6}F_{c11}\\
(13)\quad & N_{c7} \to F_{c9}A_{c7}F_{c10}A_{c6}F_{c12}\\
(14)\quad & N_{c4} \to F_{c13}A_{c5}F_{c14}\\
(15)\quad & N_{c8} \to F_{c15}A_{c5}F_{c14}
\end{aligned}
\]


In Example 3.3, the S-production rules cover the most probable starting points of the boundary. Due to the rotation of the object, the starting point may be any of the convex points. Instead of looking through the whole boundary chain for a fixed starting point, we use the S-production rules to take care of the most probable starting points, so that we only need to look over a short portion of the boundary chain for a sharp convex point that can serve as the starting point. Because of noise, the breaking points sometimes cannot be found in extracting primitives. For instance, say the corner of angle primitive $A_{c1}$ in Figure 3.7 is smeared so that $F_{c3}$ and $F_{c5}$ cannot be extracted. We can avoid this trouble by finding $A_{c4}$, see Figure 3.8, to extract $F_{c4}$ and $F_{c8}$ instead. With this idea, the noise problem at the breaking points can be taken care of by employing different segmentations. This example has essentially two sets of segmentations, as seen in Figures 3.7 and 3.8. Non-simple curve segments are used as primitives to reduce the number of primitives, and consequently to reduce the difficulty of extracting primitives from noisy shapes. The assignment of primitives is very flexible.
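
The role of the alternative S-production rules can be checked mechanically: expanding rules (1) and (3) of Example 3.3 down to primitives should give the same cyclic string of primitives, started at different boundary points. The Python sketch below does this. It is an illustration only: the rule bodies are transcribed as read from the listing (with the class subscript c dropped), and the helper names `expand` and `is_rotation` are assumptions of this example.

```python
# N-rules (8), (10), (12), (14) and S-rules (1), (3), as read from Example 3.3.
RULES = {
    "N1": ["F2", "A5", "F3"],
    "N2": ["F5", "A6", "F6", "A7", "F7"],
    "N3": ["F9", "A7", "F10", "A6", "F11"],
    "N4": ["F13", "A5", "F14"],
}
S_RULE_1 = ["N1", "A1", "N2", "A2", "F1", "A2", "N3", "A1", "N4", "A3"]
S_RULE_3 = ["F1", "A2", "N3", "A1", "N4", "A3", "N1", "A1", "N2", "A2"]

def expand(symbols, rules):
    """Rewrite nonterminals recursively until only primitives remain."""
    out = []
    for s in symbols:
        if s in rules:
            out.extend(expand(rules[s], rules))
        else:
            out.append(s)
    return out

def is_rotation(x, y):
    """True if list y is a cyclic rotation of list x."""
    if len(x) != len(y):
        return False
    doubled = x + x
    return any(doubled[i:i + len(y)] == y for i in range(len(x)))

string_1 = expand(S_RULE_1, RULES)
string_3 = expand(S_RULE_3, RULES)
print(string_1)
print(is_rotation(string_1, string_3))
```

Each additional S-production therefore costs one extra rule but saves a search of the whole boundary chain for a canonical starting point.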


3.5 Recognition of Primitives

The  first  three  problems  mentioned  in  Section  3.2  are  relevant  to 
the  description.  They  are  discussed  in  Sections  3.2,  3.3,  and  3.4.  The 
other  two  problems  relevant  to  the  recognition  are  (1)  primitive  extrac- 
tion, and  (2)  shape  recognition  by  parsing. 


As mentioned previously, a curve segment can be described by four attributes. But the translation, scaling, and rotation of the image may create different values for the attributes, and also introduce different noise in digitization. Fortunately, the attributes can be transformed into a multi-dimensional space, in which the transformed descriptors are theoretically invariant under the above operations in the noise-free case.

Definition 3.8: Transformation $T: D(p) \to T(p)$, or

\[
T: (\vec{C}, L, A, S) \to \left(\frac{C}{L},\ \frac{L}{L_0},\ A,\ \frac{S}{L}\right),
\]

where $C = |\vec{C}|$ and $L_0$ is a scale normalization factor, which could be the total length of the shape pattern.

Theorem  E:  The  C-descriptor  transformed  by  T,  T(p),  is  invariant  with 
respect  to  translation,  rotation,  and  scaling  of  the  image. 

Proof: Let us consider an analog image, so that the shape does not contain any digitization noise. It is easy to see that translation does not affect the attributes. $\vec{C}$ changes with scaling and rotation. $L$, $S$, and $L_0$ change proportionally with scaling only. Therefore, the divisions eliminate all the scaling factors, and the absolute value of $\vec{C}$ is invariant with respect to rotation.

Q.E.D. 
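
The invariance can be checked numerically on a noise-free polyline. The Python sketch below is an illustration under stated assumptions, not the original implementation: the function names, the clockwise-positive sign convention, the sample points, and the choice of $L_0$ are all made up for the example, and the transformed descriptor is taken to be (C/L, L/L0, A, S/L). The symmetry attribute is accumulated with the recursion of Theorem C. The last lines also compare the curve with its mirror image traversed in the opposite direction.

```python
import math

def descriptor(points):
    """(C, L, A, S) of an open polyline; C is the length of the end-to-end vector."""
    vecs = [(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in zip(points, points[1:])]
    L = A = G = 0.0
    for k, (vx, vy) in enumerate(vecs):
        if k > 0:
            ux, uy = vecs[k - 1]
            # Signed turning angle from the previous vector to this one,
            # counted clockwise-positive (an assumed convention).
            A -= math.atan2(ux * vy - uy * vx, ux * vx + uy * vy)
        Lk = math.hypot(vx, vy)
        L += Lk
        G += A * Lk          # G_Mj = G_M(j-1) + A_Mj * L_j
    (x0, y0), (xn, yn) = points[0], points[-1]
    return math.hypot(xn - x0, yn - y0), L, A, G - 0.5 * A * L

def transform(points, L0):
    """Transformed descriptor (C/L, L/L0, A, S/L)."""
    C, L, A, S = descriptor(points)
    return (C / L, L / L0, A, S / L)

def similarity(points, angle, scale, dx, dy):
    """Rotate by `angle`, scale uniformly, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [(scale * (c * x - s * y) + dx, scale * (s * x + c * y) + dy)
            for x, y in points]

pts = [(0.0, 0.0), (2.0, 0.0), (3.0, 1.0), (3.0, 4.0)]
original = transform(pts, L0=20.0)
moved = transform(similarity(pts, 0.7, 2.5, 4.0, -1.0), L0=2.5 * 20.0)
mirrored = transform([(-x, y) for x, y in reversed(pts)], L0=20.0)
print(original)
print(moved)
print(mirrored)
```

Note that the normalization factor has to be scaled together with the shape, which happens automatically when $L_0$ is measured on the shape itself.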

Because of the invariant property, the recognition of the primitives, and hence of the whole shape, is based on the transformed descriptors. If two curve segments are mirror images of each other, their transformed descriptors differ only in the sign of S/L. Consequently, if necessary, the storage for the transformed descriptors of a bisymmetric shape, e.g., the top view of an airplane, can be reduced by half by storing them in pairs.

To  recognize  a primitive  in  the  boundary,  we  simply  rely  on  the 
similarity  between  the  transformed  descriptors.  But  digitization  intro- 
duces different  noise  to  the  descriptors  with  respect  to  any  operation 
of  translation,  rotation,  and  scaling.  Normally,  if  the  digitization 
resolution  is  fine  enough,  the  digitization  noise  introduced  by  rotation 
is much more significant than that due to scaling and translation,
because rotation influences the angle feature, A, significantly. Although
we  can  apply  some  smoothing  techniques  to  the  boundary  vector  chain,  the 
boundary  cannot  completely  be  recovered  in  all  cases.  We  need  to  study 
the  possible  distribution  of  the  feature  values  under  various  rotations 
to  help  construct  a proper  similarity  measurement. 

In  the  following  paragraphs,  we  will  concentrate  on  the  noise  ef- 
fect of  rotation  on  the  shape  information  of  a curve  segment.  The  shape 
information is characterized by (C/L, A, S/L), while $L/L_0$ represents
the size of the curve segment in proportion to the total length.

For convenience, a normalization function is defined as follows:

Definition 3.9: A normalization function, N, of the curve primitive
descriptor is:

$N: (t, L, A, S) \to (X, Y, Z)$

where

$X = C/L$

$Y = A/2\pi$ (angle in terms of revolution)

$Z = S/AL$, $-0.5 < Z < 0.5$ for a simple curve segment
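
As a small illustration, the normalization can be written as a function. This is a sketch under assumptions: X is taken to be C/L with C = |t|, t is treated as a two-dimensional vector, and Z is set to zero when the total angle A is zero (a convention chosen here only to avoid a division by zero). The function name and the sample values are hypothetical.

```python
import math


def normalize(t, L, A, S):
    """Map a curve descriptor (t, L, A, S) to the normalized triple (X, Y, Z)."""
    c = math.hypot(t[0], t[1])      # C = |t|
    x = c / L                       # assumed definition of X
    y = A / (2.0 * math.pi)         # angle in revolutions
    # Assumption: Z is taken as 0 for a segment with zero total angle.
    z = S / (A * L) if A != 0.0 else 0.0
    return x, y, z


# Illustrative values only.
x, y, z = normalize(t=(3.0, 4.0), L=6.0, A=math.pi / 2.0, S=1.0)
print(x, y, z)
```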


An  experiment  was  designed  and  carried  out  through  the  following 
steps  to  study  the  distribution  of  the  normalized  variables  under  dif- 
ferent rotations. 

Experiment 3.1:

1. A picture with a clear boundary was scanned at 8 different
rotation angles.

2.  Shapes  on  the  digital  pictures  were  traced  out  and  passed 
through  a smoothing  procedure.  The  vector  chains  were  ob- 
tained. (Boundary  tracing  and  smoothing  will  be  discussed 
in  Chapter  4.) 

3.  Manual  extraction  of  primitives  from  the  chain  was  per- 
formed via  an  interactive  procedure. 

4.  The  descriptors  of  the  manually  extracted  primitives  were 
computed  and  transformed  by  N. 


5. The distributions of the normalized variables were studied.


The three normalized variables, X, Y, Z, constitute a three-
dimensional space, named 3-D for short. Table 3.1 illustrates the
pictures used, the curve primitives, and the corresponding symbols in
Figures 3.9-3.11, which show the two-dimensional displays of the
distributions of descriptors. Each symbol in the figure indicates the
position of a descriptor in 3-D.

Several  interesting  aspects  are  observed: 

(1) The 3 variables characterize the shape well. Any pair of clusters
can be separated in at least one of the 2-dimensional displays.

(2) The number of points is insufficient to reveal a parametric
distribution, but the points within each cluster are considerably close
together.

(3) The variable $Z = S/AL$ is more spread out than the other two.
The reason for this might be that the noise in A and L is accumulated
in the calculation of S, which is the summation of the partial
products of A and L.

This  experiment  does  not  adequately  demonstrate  the  distribution. 
Besides,  the  distribution  could  be  changed  with  the  boundary  smoothing 
techniques.  However,  this  experiment  gives  us  the  idea  to  construct  a 
similarity  measurement.  In  the  experiment,  we  also  noticed  that  the 
noise  at  the  breaking  points  had  a much  greater  effect  on  the  attributes 
than  that  at  the  middle  of  the  curve  segments. 

From the above study of the 3-dimensional normalized space, we may
assume that the clusters in the 4-dimensional transformed attribute space
are also well separated. Thus, we suggest recognizing a curve primitive
by means of a distance measure in the 4-dimensional transformed attribute
space. Without loss of generality, we assume each curve primitive, Q,
has a reference point, T(Q), in the 4-dimensional transformed space. If
there is a curve segment, q, whose transformed C-descriptor, T(q), is
relatively close to T(Q) in the 4-dimensional space, q is recognized as
Q. In other words, there is a recognition function $R_Q$ such that if
$R_Q(q) < t_Q$, where $t_Q$ is a threshold, then $q \sim Q$. In general,
$R_Q$ can be a distance, similarity, or probability function dependent
on Q. We could rewrite $R_Q(q)$ as R(Q,q).
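
One possible form of such a recognition function is a thresholded distance. The sketch below assumes a weighted Euclidean distance between two 4-component transformed descriptors; the text leaves the choice of R open, so the distance, the weights, the threshold, and the sample descriptor values are all illustrative assumptions, and the function names are hypothetical.

```python
import math


def recognition_distance(reference, observed, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted Euclidean distance between T(Q) and T(q) (an assumed choice of R)."""
    return math.sqrt(sum(w * (r - o) ** 2
                         for w, r, o in zip(weights, reference, observed)))


def is_recognized(reference, observed, threshold):
    """q is recognized as Q when R(Q, q) falls below the threshold t_Q."""
    return recognition_distance(reference, observed) < threshold


# Illustrative descriptors only, one value per transformed component.
T_Q = (0.83, 0.10, 1.57, 0.15)   # reference point of primitive Q
T_q = (0.80, 0.12, 1.50, 0.16)   # descriptor of an observed segment q
print(is_recognized(T_Q, T_q, threshold=0.1))
```

Unequal weights could reflect the observation that some components are noisier than others under rotation.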

The recognition of the angle primitive is similar, but simpler.
For an angle primitive A, there is a function $R_A$. For any angle a,
if $R_A(a) < t_A$, then $a \sim A$. $R_A(a)$ can be rewritten as R(A,a).
Theoretically, the angle primitive has no length. Since sharp corners
are often smoothed by noise, we allow a short length for angle
primitives, called the "corner tolerance". Of course, it is possible to
employ the concept of partial recognition, or recognition with
probability p, $0 \le p \le 1$, instead of "yes" and "no" for both curve
and angle primitives.

Definition 3.10: $A \sim B$ implies that A and B are recognized as the same.
A and  B can  be  primitives,  nonterminals,  or  vector  chains. 

Let  us  extend  the  above  definition  to  a string  of  primitives,  non- 
terminals, or  vectors. 

Definition 3.11: $X_1 X_2 \ldots X_n \sim v_i \ldots v_j$, i.e., the vector
chain $v_i \ldots v_j$ is recognized as a string of primitives or
nonterminals, $X_1 X_2 \ldots X_n$, iff there is a feasible segmentation
on the vector chain at vectors $v_{k_0}, v_{k_1}, \ldots, v_{k_n}$, where
$k_0 = i$, $k_n = j$, such that $v_{k_{p-1}} \ldots v_{k_p} \sim X_p$, for
$1 \le p \le n$.
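
The search for feasible segmentations can be sketched as a simple recursive enumeration. The code assumes that a predicate `recognized(symbol, start, end)` is available to decide whether the subchain from vector `start` to vector `end` is recognized as `symbol`; here it is backed by a small hand-made candidate table. The function name, the symbol names, and the table are hypothetical and are not taken from the examples in the text.

```python
from typing import Callable, List, Sequence


def feasible_segmentations(symbols: Sequence[str], i: int, j: int,
                           recognized: Callable[[str, int, int], bool]
                           ) -> List[List[int]]:
    """Return every breakpoint list [k0, ..., kn] with k0 = i and kn = j
    such that the p-th subchain is recognized as the p-th symbol."""
    results: List[List[int]] = []

    def extend(p: int, breaks: List[int]) -> None:
        start = breaks[-1]
        if p == len(symbols):
            if start == j:              # all symbols placed, chain used up
                results.append(list(breaks))
            return
        for end in range(start + 1, j + 1):
            if recognized(symbols[p], start, end):
                breaks.append(end)
                extend(p + 1, breaks)
                breaks.pop()

    extend(0, [i])
    return results


# Hypothetical candidate table: (symbol, start index, end index).
candidates = {("F1", 1, 3), ("F1", 1, 4), ("A1", 4, 5), ("F2", 5, 8)}
segs = feasible_segmentations(["F1", "A1", "F2"], 1, 8,
                              lambda s, a, b: (s, a, b) in candidates)
print(segs)
```

In this toy table the candidate ("F1", 1, 3) is discarded because no angle candidate starts at vector 3, which is the kind of contextual pruning discussed in the next section.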


3.6 Parsing Schemes

The  primitives  can  be  extracted  on  the  basis  of  the  recognition 
functions  suggested  in  Section  3.5.  If  this  can  be  done  without  knowing 
the  context  before  parsing,  then  an  input  pattern  can  be  represented  as 
a string  of  primitives  and  an  ordinary  parser  can  be  used  to  analyze  the 
syntax  of  the  primitive  string  according  to  a given  grammar.  This  is 
the  normal  procedure  of  the  syntactic  approach.  An  ordinary  parser  con- 
structs a derivation  tree  which  derives  the  input  primitive  string  using 
the  production  rules  of  the  given  grammar.  (See  Figure  3.12.)  But,  any 
context  information  can  certainly  help  to  reduce  ambiguities  in  primi- 
tive extraction.  Let  us  look  at  the  following  examples. 

Example  3.4:  Figure  3.6  is  the  top  view  of  a typical  airplane.  The  lo- 
cal descriptions,  or  the  boundary  subchains,  around  the  right  tail  and 
the  top  of  the  right  wing  are  very  similar.  It  is  difficult  to  deter- 
mine whether  the  short  vertical  segment  is  primitive  p^  or  part  of  prim- 
itive p.j2  unless  we  look  ahead  to  see  p^,  or  the  whole  p^. 

Example 3.5: Two shape subpatterns in Figure 3.13 can be described as
$N_1 \to p_1 b_1 p_2$ and $N_2 \to p_3 b_2 p_4$ respectively. The
transformed descriptors of $p_2$ and $p_4$ are very similar. So are
those of $b_1$ and $b_2$. Let us try
to  recognize  a noisy  vector  subchain  shown  ir.  Figure  3.14.  v^v^  is  very 
close  to  p^.  If  we  extract  v^v^  as  p^,  the  recognition  procedure  may 
stop  at  Vj  after  aj  is  recognized  as  b^.  No  primitive  can  be  extracted 
starting  from  v^.  If  we  extract  v^VjVj  as  P3/  t^1en  we  can  proceed  to 
extract  a^  as  b^  and  v^v^  as  p^.  But  the  machine  will  not  know  that  the 
extraction  of  v^v^  as  p^  is  incorrect  until  v^  and  v^  are  observed.  The 
description  of  may  be  closer  to  p^  than  v^v^Vj  is  to  p-j. 


Figure 3.13 Two Shape Patterns and Two Productions

Figure 3.14 Recognition of a Noisy Vector Subchain

Figure 3.15 Recognition of a Noisier Vector Subchain

Example 3.6: Suppose that the machine is asked to recognize a noisier
pattern  such  as  Figure  3.15  versus  Figure  3.13  (a)  and  (b).  The  noisy 
pattern  has  13  vectors.  In  order  to  allow  some  noise,  the  recognition 
function  cannot  be  very  selective.  There  may  be  several  candidates  for 
each  primitive. 


Primitive 


Candidates 


u!  V U1  **•  u6 


U1  '**  u5'  U1  ***  V U1  *"  V U1  ***  u8 


P2  or  p^ 


b^  or  b^ 


u9  u13'  u8  *•*  u13 


u6  •••  V U8V  u7  •••  u9 


The  candidates  for  angle  primitives  must  have  close  angle  values  and 
must  be  shorter  than  corner  tolerance  in  length. 

If we check all the candidates with the production rules, we find
that there are only three possible combinations, or feasible
segmentations. They are $u_1 \ldots u_6 \ldots u_8 \ldots u_{13}$ for
$N_1$, and $u_1 \ldots u_7 \ldots u_9 \ldots u_{13}$,
$u_1 \ldots u_8 u_9 \ldots u_{13}$ for $N_2$. A feasible segmentation
$u_1 \ldots u_7 \ldots u_9 \ldots u_{13}$ means that $u_1 \ldots u_7$,
$u_7 \ldots u_9$, and $u_9 \ldots u_{13}$ are extracted consistently as
three primitives of $N_2$. If we check the descriptors of the whole
vector chain with those of $N_1$ and $N_2$, we find that
$u_1 \ldots u_6 \ldots u_8 \ldots u_{13}$ cannot be $N_1$ because of the
total angle and the vector length of $\sum_{i=1}^{13} u_i$. Therefore,
the unknown noisy subchain is recognized as $N_2$ through two feasible
segmentations, $u_1 \ldots u_7 \ldots u_9 \ldots u_{13}$ and
$u_1 \ldots u_8 u_9 \ldots u_{13}$.

The above examples show that the ambiguity of primitive extraction
can  be  resolved  by  using  contextual  information.  The  contextual  infor- 
mation which  is  described  by  production  rules  is  used  in  parsing. 
Therefore,  if  the  primitives  are  extracted  during  the  parsing,  the  ex- 
traction can  be  improved.  Based  on  this  idea,  we  have  developed 
primitive-extraction-embedding  (PEE)  parsers.  A PEE  parser  performs  the 
job  of  recognizing  primitives  and  nonterminals  during  the  parsing.  In 
Figure  3.16,  the  dashed  blocks  indicate  recognition  functions.  The  idea 
is,  if  more  than  one  candidate  can  pass  through  the  recognition  function 
for  a primitive  or  nonterminal,  instead  of  making  the  selection  immedi- 
ately, the  machine  looks  ahead  to  the  context,  checks  the  production 
rules,  and  then  discards  inappropriate  candidates. 

As  we  can  see  in  Figure  3.15  and  Example  3.6,  the  segmentation  of  a 
noisy  boundary  is  fuzzy  in  the  sense  that  it  is  difficult  to  find  a de- 
finitely correct  breaking  point.  Even  a human  can  hardly  break  the 
noisy  vector  chain  exactly.  Therefore,  the  recognition  of  Figure  3.15 
in  Example  3.6  succeeded  with  two  feasible  segmentations  on  the  noisy 
boundary corresponding to one segmentation on the true pattern (see
Figure 3.13(b)). The feasible segmentations on the noisy boundary are
called  noisy  boundary  segmentations  (NBS's)  and  the  segmentation  on  the 
true  boundary  is  called  a reference  segmentation.  It  is  difficult  to 
decide  which  of  the  two  NBS’s  is  more  accurate  than  the  other.  There- 
fore, we  accept  both  of  them  as  correct  segmentations.  There  might  be 
more  NBS's  corresponding  to  one  reference  segmentation  if  the  vector 
chain  is  very  noisy  and  the  shape  is  complex. 
To demonstrate the feasibility of PEE, we have modified two parsing
algorithms to embed the primitive extraction. They are Earley's parser
[43] for context-free grammars and finite automata [43] for finite-state
grammars. Earley's parsing algorithm consists of two parts: parsing
table generation and parse extraction from the table. For classification
purposes, the first part is sufficient. Therefore, we have only
modified the first part of the algorithm. The flow-chart of the
modified algorithm is shown in Figure 3.17. The grammars used are in
context-free form as described in Section 3.4.

The major modifications are that (1) the indices now point to
vectors instead of primitives, and (2) the recognition functions are
added for extracting primitives. In the following algorithm,
$v_1 \ldots v_m$ is the unknown vector chain. T(X) denotes the
transformed descriptor of a primitive or nonterminal,
$X \in (V_N \cup V_T) - \{S\}$. T(i,j) denotes the transformed
descriptor of the subchain $v_i v_{i+1} \ldots v_j$. For the angle
primitive, T(i,j) designates the angle change from $v_i$ to $v_j$, where
the curve length from $v_{i+1}$ to $v_{j-1}$ is shorter than the corner
tolerance. If j = i+1, the curve length for the angle is zero.
$I_1 \ldots I_{m+1}$ are the parse lists. $T(X) \approx T(k,j)$ implies
that $v_k \ldots v_j$ is recognized as X, or say,
$v_k \ldots v_j \sim X$.

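Algorithm 3.1: The PEE Earley's Parser

Input: A context-free shape grammar and an unknown chain of m vectors.

Output: "Accept" or "Reject"

Method:

(1) Add $[S \to \cdot\alpha, 1]$ to $I_1$, for all $S \to \alpha$ in P.
    Set j = 1.

(2) (a) If $[N \to \alpha \cdot B\beta, i]$ is in $I_j$ and
        $B \to \gamma$ is in P,
        then add $[B \to \cdot\gamma, j]$ to $I_j$.

    (b) If $[N \to \alpha\cdot, i]$ is in $I_j$,
        then for all $[B \to \beta \cdot N\gamma, k]$ in $I_i$
        add $[B \to \beta N \cdot \gamma, k]$ to $I_j$.

(3) j = j+1.
    If j > m+1 go to (5).
    For all $[N \to \alpha \cdot X\beta, i]$ in $I_k$, $1 \le k < j$,
    $X \in \{F\text{'s}, A\text{'s}\}$:

    (a) If $\beta \ne \lambda$ and $T(X) \approx T(k,j)$,
        then add $[N \to \alpha X \cdot \beta, i]$ to $I_j$.

    (b) If $\beta = \lambda$, $T(X) \approx T(k,j)$ and
        $T(N) \approx T(i,j)$,
        then add $[N \to \alpha X\cdot, i]$ to $I_j$.

(4) Go to (2).

(5) If $[S \to \alpha\cdot, 1]$ is in $I_{m+1}$ for some $\alpha$,
    terminate with "Accept"; otherwise terminate with "Reject".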
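
A compact sketch of a recognizer in this style is given below. It is not the original implementation: it assumes a grammar without empty productions, represents items as tuples (left-hand side, right-hand side, dot position, origin index), and replaces the descriptor comparisons with one hypothetical predicate `recognized(symbol, i, j)` used for both primitives and nonterminals. The toy grammar and the span table are invented for illustration.

```python
def pee_earley(grammar, start, m, recognized):
    """Earley-style recognition with primitive extraction embedded in scanning.

    grammar: dict mapping each nonterminal to a list of right-hand-side tuples.
    recognized(symbol, i, j): assumed predicate standing in for T(symbol) ~ T(i, j).
    """
    lists = {j: set() for j in range(1, m + 2)}        # parse lists I_1 .. I_{m+1}

    def close(j):
        work = list(lists[j])
        while work:
            lhs, rhs, dot, origin = work.pop()
            new_items = []
            if dot < len(rhs) and rhs[dot] in grammar:          # predict nonterminal
                new_items = [(rhs[dot], alt, 0, j) for alt in grammar[rhs[dot]]]
            elif dot == len(rhs):                               # complete an item
                new_items = [(l2, r2, d2 + 1, o2)
                             for (l2, r2, d2, o2) in list(lists[origin])
                             if d2 < len(r2) and r2[d2] == lhs]
            for item in new_items:
                if item not in lists[j]:
                    lists[j].add(item)
                    work.append(item)

    for alt in grammar[start]:
        lists[1].add((start, alt, 0, 1))
    close(1)
    for j in range(2, m + 2):
        for k in range(1, j):                                   # scan with extraction
            for lhs, rhs, dot, origin in list(lists[k]):
                if dot == len(rhs) or rhs[dot] in grammar:
                    continue                                    # need a primitive next
                if not recognized(rhs[dot], k, j):
                    continue
                is_last = dot + 1 == len(rhs)
                if is_last and not recognized(lhs, origin, j):
                    continue                                    # nonterminal check
                lists[j].add((lhs, rhs, dot + 1, origin))
        close(j)
    return any(lhs == start and dot == len(rhs) and origin == 1
               for lhs, rhs, dot, origin in lists[m + 1])


# Hypothetical toy grammar and recognized spans (symbol, start, end).
grammar = {"S": [("N", "A2")], "N": [("F1", "A1", "F2")]}
spans = {("F1", 1, 2), ("F1", 1, 3), ("A1", 3, 4), ("F2", 4, 5),
         ("N", 1, 5), ("A2", 5, 7), ("S", 1, 7)}
accepted = pee_earley(grammar, "S", 6, lambda s, i, j: (s, i, j) in spans)
print(accepted)
```

In the toy data both (1, 2) and (1, 3) are candidates for F1; only the second survives because no angle candidate follows the first, so the choice is made by context rather than at extraction time.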


Figure 3.17 The Flow-Chart of PEE Earley's Algorithm
Therefore, if $[S \to \alpha\cdot, 1]$ is in $I_{m+1}$ for some
$\alpha$, there exists at least one noisy boundary segmentation
corresponding to a reference segmentation derived from $S \to \alpha$.

The following example illustrates how the contextual information is
used to help extract primitives in Earley's algorithm.

Example 3.7: Let us look at step (3.b). In extracting $F_1$ of
$[N \to \alpha \cdot F_1, i_0]$ in $I_{k_0}$ from the vector chain,
$T(F_1) \approx T(k_0, j)$ and $T(N) \approx T(i_0, j)$ may be true for
both $j = j_1$ and $j = j_2$. In other words, subchains
$v_{k_0} \ldots v_{j_1}$ and $v_{k_0} \ldots v_{j_2}$ are candidates for
$F_1$, so $[N \to \alpha F_1 \cdot, i_0]$ is in $I_{j_1}$ and
$[N \to \alpha F_1 \cdot, i_0]$ is in $I_{j_2}$. Suppose that
$j_1 < j_2$. After the execution of step (2) for $j = j_2$,
$[B \to \beta N \cdot \gamma, k_1]$ is in $I_{j_1}$ and
$[B \to \beta N \cdot \gamma, k_1]$ is in $I_{j_2}$, where
$k_1 \le i_0 \le k_0 < j_1 < j_2$. Suppose that
$\gamma = A_1 F_2 A_2 F_3$. The execution of step (3.a) to extract $A_1$
of $[B \to \beta N \cdot A_1 F_2 A_2 F_3, k_1]$ in $I_{j_1}$ and
$I_{j_2}$ from the vector chain may reveal that
$T(A_1) \not\approx T(j_1, j_3)$ for any $j_3 > j_1$, but
$T(A_1) \approx T(j_2, j_4)$ for some $j_4 > j_2$. Then only
$[B \to \beta N A_1 \cdot F_2 A_2 F_3, k_1]$ is added to $I_{j_4}$. That
is, the context information is used to select the subchain
$v_{k_0} \ldots v_{j_2}$ for $F_1$ and discard $v_{k_0} \ldots v_{j_1}$.

If $T(A_1) \approx T(j_1, j_3)$ for some $j_3 > j_1$, and
$T(A_1) \approx T(j_2, j_4)$ for some $j_4 > j_2$, then
$[B \to \beta N A_1 \cdot F_2 A_2 F_3, k_1]$ is in $I_{j_3}$ and
$[B \to \beta N A_1 \cdot F_2 A_2 F_3, k_1]$ is in $I_{j_4}$. The
execution of step (3.a) to extract $F_2$ from the vector chain may
reveal that $T(F_2) \not\approx T(j_3, j_5)$ for any $j_5 > j_3$, but
$T(F_2) \approx T(j_4, j_6)$ for some $j_6 > j_4$. Then only
$[B \to \beta N A_1 F_2 \cdot A_2 F_3, k_1]$ is added to $I_{j_6}$. That
is, the lookahead on the information of the subsequent subchain selects
the candidate $v_{k_0} \ldots v_{j_2}$ for $F_1$.

In fact, the extraction of A's and F's embedded in the parsing is
different  from  the  pre-extraction  of  the  primitives  done  without  knowing 
the  contextual  information.  The  advantage  is  that  the  extraction  would 
be  much  more  accurate  in  a global  sense. 

As illustrated in Example 3.1 and Figure 3.6, each nonterminal is
semantically  significant  and  is  described  by  the  attributes.  In  PEE 
Earley's  algorithm,  the  recognition  of  both  nonterminals  and  primitives 
is  performed.  But,  the  recognition  of  nonterminals  is  sometimes  un- 
necessary, when  the  primitive  recognition  and  syntax  analysis  are  suffi- 
cient to  discriminate  the  classes.  In  general,  the  languages  generated 
by  our  context-free  shape  grammars  can  also  be  generated  by  finite-state 
grammars.  But,  the  nonterminals  in  finite-state  shape  grammars  are  not 
very  significant  in  semantics.  In  other  words,  we  use  a context-free 
grammar  to  describe  a shape,  because  we  like  to  take  advantage  of  the 
context-free  form,  not  because  the  corresponding  language  has  to  be  gen- 
erated by  a context-free  grammar. 

Remark: The context-free shape grammar (CFSG) and finite-state shape
grammar (FSSG) refer to the shape grammars in context-free and
finite-state forms, respectively.

For problems in which nonterminal recognition is not necessary, only
finite-state grammars are used. Therefore, we also developed a PEE
finite automaton. Since we can always find an angle primitive following a
curve primitive, we treat each input step as consisting of a curve
primitive followed by an angle primitive. Figure 3.18 shows the storage
of a finite-state grammar in a structural form. The automaton, with its
flow-chart shown in Figure 3.19, uses a STACK. Each element in the
STACK contains two fields, state and vtpt, which means the first vtpt-1
vectors of the unknown shape have been accepted through the state. FS
is a set of final states.

Algorithm 3.2: The PEE Finite-State Automaton

Input: A finite-state shape grammar in tabular form (Figure 3.18) and
an unknown chain of m vectors.

Output: "Accept" or "Reject"

Method:

(1) $kp \leftarrow 1$
    $STACK(kp) \leftarrow (S, 1)$

(2) If kp = 0 then terminate with "Reject"
    otherwise $s \leftarrow state(STACK(kp))$
              $p \leftarrow vtpt(STACK(kp))$
              $kp \leftarrow kp - 1$
              $tp \leftarrow PTR(s)$

(3) If tp = 0 GOTO (2)

(4) $nxp \leftarrow nxpt(TABLE(tp))$
    $F \leftarrow curve(TABLE(tp))$
    $A \leftarrow angle(TABLE(tp))$
    $nxs \leftarrow nxst(TABLE(tp))$

(5) For all x, y, $p \le x < y \le m+1$:
    If ($T(p,x) \approx T(F)$ and $T(x,y) \approx T(A)$) then
        if $nxs \in FS$ and y = m+1 then GOTO (7)
        otherwise $kp \leftarrow kp+1$, $STACK(kp) \leftarrow (nxs, y)$

(6) If nxp = 0 then GOTO (2)
    otherwise $tp \leftarrow nxp$, GOTO (4)

(7) Terminate with "Accept"
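
The stack-driven search can be sketched as follows. This is an illustrative reading of the algorithm, not the original program: the linked TABLE/PTR storage is replaced by a dictionary from each state to its list of (curve, angle, next state) productions, and the descriptor comparisons are replaced by two hypothetical predicates backed by a hand-made span table.

```python
def pee_automaton(transitions, start, finals, m, match_curve, match_angle):
    """Depth-first PEE search over a finite-state shape grammar.

    transitions: dict state -> list of (curve, angle, next_state).
    match_curve(F, p, x) and match_angle(A, x, y) are assumed predicates
    standing in for T(p, x) ~ T(F) and T(x, y) ~ T(A).
    """
    stack = [(start, 1)]                    # elements are (state, vtpt)
    while stack:
        state, p = stack.pop()
        for curve, angle, next_state in transitions.get(state, []):
            for x in range(p, m + 1):
                if not match_curve(curve, p, x):
                    continue
                for y in range(x + 1, m + 2):
                    if not match_angle(angle, x, y):
                        continue
                    if next_state in finals and y == m + 1:
                        return True         # first feasible segmentation found
                    stack.append((next_state, y))
    return False


# Hypothetical grammar and recognized spans (symbol, start, end).
transitions = {"S": [("F1", "A1", "Q")], "Q": [("F2", "A2", "END")]}
spans = {("F1", 1, 2), ("F1", 1, 3), ("A1", 3, 4), ("F2", 4, 5), ("A2", 5, 7)}


def in_spans(symbol, i, j):
    return (symbol, i, j) in spans


result = pee_automaton(transitions, "S", {"END"}, 6, in_spans, in_spans)
print(result)
```

Because y always exceeds p, the vector index stored on the stack strictly increases along any path, so the search terminates.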

When primitives are extracted during parsing, there are sometimes
many candidates for a primitive. In PEE Earley's parsing, the items
corresponding to these candidates will be added appropriately to the
parse lists according to the vector indices, and their subsequent
extractions will be processed in parallel. In the PEE finite automaton,
the states and vector indices corresponding to these candidates will be
pushed onto the stack, and their subsequent extractions will be
processed one at a time. The PEE Earley's parser stops when the vector
indices reach the last vector. If there is more than one feasible noisy
boundary segmentation, e.g., Example 3.6, they will all be found at the
end of processing. The PEE finite automaton stops whenever one noisy
boundary segmentation is found. In strategy, the implementation of the
PEE finite automaton is somewhat like bottom-up backtrack parsing.

The PEE Earley's algorithm basically implements a breadth-first
search, while the PEE automaton implements a depth-first search. Since
the depths for both searches are the same, m+1, the PEE automaton is
less time consuming. They both search for feasible noisy boundary
segmentations which satisfy the production rules and descriptors. The
automaton recognizes the primitives, while Earley's algorithm recognizes
the nonterminals as well as the primitives. If we abandon the
recognition of nonterminals in Earley's algorithm, the two algorithms
will end up with the same classification result. But the automaton
would be faster, because it stops at the first feasible set of
primitives found. However, the recognition of nonterminals upgrades the
discriminating power of Earley's algorithm.


3.7 Classification Tree

As  mentioned  before,  the  four  attributes  do  not  uniquely  character- 
ize a curve  segment.  The  curve  segments  in  Figure  3.20  (a)  and  (b)  have 
the  same  C-descriptor.  If  the  difference  between  the  two  figures  is 
caused  by  noise,  then  the  insensitivity  of  the  descriptor  completely  ig- 
nores that  noise.  If  the  difference  is  significant  in  discriminating 
between the two classes, we can decompose Figure 3.20(b) into two
shorter curve segments at point $X_2$. $X_1X_2$ and $X_2X_3$ are
assigned to be primitives $p_1$ and $p_2$, respectively. Figure 3.20(c)
shows the decomposition. The recognition of $p_1$ and $p_2$ can
definitely discriminate Figure 3.20(a) from (b).

If we are only interested in classification, we can use very simple
shape grammars which check only some significantly different parts of
the pattern for classification. There is no need to describe the
complete shape in detail. Let us assign curve $X_1X_3$ to be primitive
$p_3$. Figure 3.20(b) can be recognized as $p_1$ and $p_2$, or only
$p_3$, but Figure 3.20(a) can only be recognized as $p_3$. This
situation suggests a decision hierarchy. We can use $p_3$ to accept
Figure 3.20(a) and (b) at the first decision stage, and use
$p_1 a_1 p_2$ to accept Figure 3.20(b) at the second decision stage.
Hence, a decision tree is built as in Figure 3.21.
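
The two-stage decision can be sketched as a tiny function. It assumes two hypothetical predicates, one for the coarse first-stage grammar and one for the finer second-stage grammar; the class labels "a" and "b" stand for the two curves of Figure 3.20, and the toy patterns are invented for illustration.

```python
def classify(pattern, accepts_coarse, accepts_fine):
    """Two-stage decision: a coarse grammar first, then a finer grammar.

    accepts_coarse and accepts_fine are assumed predicates, e.g. parsers for
    the first-stage and second-stage discriminating grammars.
    """
    if not accepts_coarse(pattern):
        return "reject"                 # neither class passes the first stage
    return "b" if accepts_fine(pattern) else "a"


# Toy stand-ins: a pattern is the set of primitive strings it can be parsed as.
curve_a = {"p3"}
curve_b = {"p3", "p1 a1 p2"}
label_a = classify(curve_a, lambda s: "p3" in s, lambda s: "p1 a1 p2" in s)
label_b = classify(curve_b, lambda s: "p3" in s, lambda s: "p1 a1 p2" in s)
print(label_a, label_b)
```

Only patterns that pass the cheap first-stage test incur the cost of the more detailed second-stage grammar.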

The above description suggests the possibility of constructing a
classification tree for multi-class problems with several simpler
discriminating grammars at different stages. From another point of
view, a classification tree is a top-down parsing, if only one level of
production rule is used at each discriminating grammar.

Figure 3.20 Two Different Curve Segments May Have The
Same C-Descriptor


Though our shape grammars are formulated as attributed grammars
with most of the attributes calculated upwards from the bottom, Theorem
C states that it is not necessary to compute the attributes of
primitives prior to those of nonterminals. We can always compute the
attributes for a curve segment as long as we know the vector chain. For
instance, we can compute the attributes for the nonterminal N, in Figure
3.20(b), without computing those for $p_1$ and $p_2$ first.

3.8 Discussion

The  proposed  method  applies  the  syntactic  technique  in  the  general 
shape  recognition  problem.  Other  existing  syntactic  shape  recognition 
methods  have  both  merits  and  demerits.  The  merits  are  (1)  the  global 
shape  structure  is  explicitly  described  by  production  rules,  and  (2)  the 
parsing  stage  only  takes  care  of  the  syntax  information,  so  that  the 
parsing  time  for  patterns  without  error  is  fast.  The  demerits  are  (1) 
the  primitive  set  is  rather  problem-dependent,  (2)  the  specifications  of 
primitives  are  heuristic  and  primitive-dependent,  (3)  the  extraction  of 
primitives  has  to  be  done  before  parsing,  and  therefore,  it  is  heuristic 
and  inaccurate,  and  (4)  nonterminals  carry  only  syntax  information. 

The  attributes  we  proposed  carry  the  semantic  information  which 
gives  a clear  idea  of  the  curve  segment,  if  it  is  a simple  curve  seg- 
ment. For  a more  complex  curve  segment,  the  first  three  attributes,  t, 
L,  A,  still  can  roughly  characterize  it.  Thus,  our  method  has  the  fol- 
lowing merits:  (1)  Production  rules  describe  the  shape  structure  expli- 
citly, (2)  The  primitive  description  method  can  handle  a large  number  of 
primitives, (3) The primitive description method is systematic, general,
and problem-independent, (4) The primitive extraction utilizes both
semantic and syntactic information, namely, attributes and production
rules  respectively.  Hence,  the  primitive  extraction  is  systematic  and 
accurate  in  a global  sense,  (5)  The  whole  system  is  well  structured  so 
that  any  particular  part  can  be  modified  for  special  applications.  For 
example,  the  attribute  set,  attribute  rules,  transformation,  and  recog- 
nition function  are  modifiable,  (6)  With  context-free  shape  grammars, 
CFSG,  a shape  can  be  decomposed  hierarchically  so  that  each  nonterminal 
represents  a part  of  the  shape  which  is  meaningful  in  human  perception. 
Then,  the  recognition  of  nonterminals  increases  its  discriminating 
power,  and  (7)  With  our  primitive  description  method,  the  primitives  can 
be  more  sophisticated  so  that  we  don't  have  to  use  context-sensitive 
grammars,  CSG. 

If  we  use  fixed-length  curve  primitives  and  delete  the  S attribute, 
then  our  method  would  become  a general  form  for  several  existing  syntac- 
tic applications.  For  example,  in  the  chromosome  recognition  problem 
[5],  the  primitives  a,  b,  c,  d,  e,  are  shown  in  Figure  3.22.  A simpler 
example is the generation and recognition of squares [5], where the
primitives are unit vectors. Since the primitives have equal lengths,
they can be characterized by t and A. In such cases, the pattern
representation would not be size-invariant and we would need a
context-sensitive grammar to recognize the same shape with different
sizes. In our approach, the S attribute allows various declinations.
Size normalization with division by the normalization factor $L_0$ makes
the descriptor size invariant, and hence frees us from the need to use
CSG in this respect.
But for patterns in which some parts may have different lengths in
proportion,  division  by  total  length  does  not  solve  the  problem.  For 
example,  a submedian  chromosome  has  two  arm  pairs,  each  arm  pair  having 
two  arms  of  equal  length.  Figure  3.23(a)  shows  the  segmentation  with 
conventional  primitives  and  (b)  shows  one  possible  segmentation  with  our 
proposed shape primitives where a is the angle primitive and s, f,
$\ell$, c are the curve primitives. One possible sentential form is

$f\ a\ \ell\ a\ c\ a\ \ell\ a\ f\ a\ s\ a\ c\ a\ s\ a$

with

$D(a) = 0$

$D(\ell) = (t_\ell, L_\ell, A_\ell, S_\ell)$

$D(f) = (t_f, L_f, A_f, 0)$, $-\pi < A_f < 0$

$D(s) = (t_s, L_s, \pi, 0)$

$D(c) = (t_c, L_c, -\pi, 0)$

Since $C_0 = |t_c|$ is one of the values which does not vary with the
length of the arms, we may use $C_0$ as the size normalization factor and
modify the transformation T to $T_c$:

$T_c: (t, L, A, S) \to (C/C_0,\ S/L,\ A,\ L/C_0)$

so that

$T_c(D(\ell)) = (|C_\ell|/C_0,\ S_\ell/L_\ell,\ A_\ell,\ L_\ell/C_0)$

$T_c(D(f)) = (|C_f|/C_0,\ 0,\ A_f,\ L_f/C_0)$

$T_c(D(s)) = (|C_s|/C_0,\ 0,\ \pi,\ L_s/C_0)$

$T_c(D(c)) = (1,\ 0,\ -\pi,\ L_c/C_0)$

$|C_\ell|/C_0$, $|C_f|/C_0$, $|C_s|/C_0$, $L_f/C_0$ and $L_c/C_0$ are
theoretically invariant under translation, rotation, and scaling.
$L_\ell$ and $L_s$ are subject to change with the length of the arms. If
$L_\ell \approx L_s$, it is median. If $L_\ell \ne L_s$, it is
submedian. ($\approx$ means "nearly equal".)

Figure 3.22 Conventional Fixed-Length Primitives for
Chromosome

Figure 3.23 The Segmentation of a Submedian Chromosome
with (a) Fixed-Length Primitives and
(b) The Proposed Shape Primitives

The above discussion implies that we may not need CSG in solving
many shape recognition problems, if a proper transformation T can be
found. Obtaining the proper transformation seems to be problem
dependent. The major purpose of this transformation is to eliminate the
size problem. Therefore, the transformation is not difficult to obtain,
if a size normalization factor is known. However, the shape grammar we
proposed has the potential of solving a general class of shape
recognition problems without requiring CSG.
CHAPTER 4


AN EXPERIMENT OF AIRPLANE SHAPE RECOGNITION

4.1 Introduction

Like  other  existing  methods,  the  proposed  method  is  designed  on  the 
basis  of  two-dimensional  images.  Although  the  objects  to  identify  are 
usually  three-dimensional,  our  recognition  has  to  be  based  upon  what  we 
can get from economical automatic visual equipment.

The  purpose  of  this  experiment  is  to  demonstrate  the  capability  of 
our  shape  recognition  method.  Since  the  proposed  syntactic  method  can 
distinguish the shapes by structure as well as by boundary details, we
choose  airplane  models  for  our  experiments  because  some  airplane  shapes 
are  completely  different  in  structure,  e.g.  B52  and  F102,  and  some  are 
very  similar  in  structure  but  slightly  different  in  such  details  as 
tails  or  wings,  e.g.  MIG-15  and  F86.  These  models  can  be  found  in  Fig- 
ure 4.1. 

The  computer  preprocessing  consists  of  the  following  steps:  digiti- 
zation, threshold  selection,  boundary  following,  and  smoothing.  Before 
these  steps,  however,  we  have  to  prepare  the  airplane  models  and  take 
analog  pictures.  The  model  preparation,  analog  picture  taking  and  di- 
gitization will  be  discussed  in  Appendix  A.  In  the  following  sections, 
we  will  discuss  the  series  of  steps  from  threshold  selection  to  recogni- 
between the first and second peaks as the threshold for finding white
blood  cell  boundaries.  For  red  blood  cells,  the  same  technique  can  be 
used to smooth out the histogram until there are only two peaks.
Unfortunately, this method does not work as well on our airplane images,
because the small histogram peak corresponding to the object may be
smoothed out.

Because of the flat black paint on the models, the airplanes look
uniformly dark. This characteristic should create a peak in the high
gray level region of the histogram. This peak is sometimes very small
because some angle views of the airplanes, e.g. the front view, occupy
a very small area of the image. The light background is supposedly
uniform too, so the highest peak in the histogram often corresponds to
the background. Although the background may not be absolutely uniform,
it is more uniform than the object, because the object surface reflects
light slightly. Therefore, the peak corresponding to the background
usually has a steep downward slope on its higher gray level side. Thus,
we have the following algorithm for obtaining the threshold.

Algorithm 4.1: Threshold Selection

Input: A digital picture.

Output: A threshold t.

Method:

(1) Compute the histogram.

(2) Find the peak of the highest gray level (which usually
corresponds to the object). Let $k_1$ = the gray level.

(3) Find the maximum, the highest peak of the histogram besides
the one found in (2). Let $k_2$ = the corresponding gray level.

(4) Find the lowest valley between the above two peaks and let $k_3$
= gray level corresponding to the valley.

(5) Find the bottom of the sharpest down slope of the histogram
between $k_2$ and $k_3$. Let t = the corresponding gray level.

(6) Terminate.


This algorithm is efficient and accurate in selecting a threshold for
pictures with the following two characteristics: (1) good contrast
between object and background, and (2) a light background that is more
uniform than the dark object.
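
The steps of Algorithm 4.1 can be sketched in Python as follows. This is
only an illustrative sketch, not the original implementation. The
function name `select_threshold`, the definition of a peak as a simple
local maximum of the histogram, and the reading of "the bottom of the
sharpest down slope" (the steepest single-step drop between $k_2$ and
$k_3$, followed downhill until the histogram stops decreasing) are all
assumptions made here.

```python
def select_threshold(hist):
    """Sketch of Algorithm 4.1 on a histogram indexed by gray level.

    Assumes dark pixels have HIGH gray levels, as in the text.
    """
    n = len(hist)
    padded = [0] + list(hist) + [0]
    # Assumed peak definition: strictly above the left neighbour and
    # not below the right neighbour (one peak per plateau).
    peaks = [g for g in range(n)
             if padded[g + 1] > padded[g] and padded[g + 1] >= padded[g + 2]]
    if len(peaks) < 2:
        raise ValueError("the histogram needs at least two peaks")

    k1 = peaks[-1]                                  # step (2): object peak
    k2 = max(peaks[:-1], key=lambda g: hist[g])     # step (3): highest other peak
    k3 = min(range(k2 + 1, k1), key=lambda g: hist[g])  # step (4): valley

    # Step (5): steepest single-step drop between k2 and k3 ...
    g = max(range(k2, k3), key=lambda j: hist[j] - hist[j + 1])
    t = g + 1
    # ... followed down to where the histogram stops decreasing.
    while t < k3 and hist[t + 1] < hist[t]:
        t += 1
    return t
```

With a histogram such as
`[0, 10, 60, 100, 20, 5, 4, 3, 2, 2, 1, 2, 6, 9, 4, 0]`, the background
peak is at gray level 3, the object peak at 13, the valley at 10, and
the sketch places the threshold at the foot of the background slope.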


After a threshold is found, the boundary can be traced out. The
boundary may be defined in two different ways: (1) the connection of the
outermost pixels of the object, and (2) the connection of the edges
between the object and the background. The boundary by the first
definition can be coded by the octal Freeman chain codes [42,59]. Under
this definition a narrow bar shape with a width of one pixel will be
coded as zero pixels wide. Since we hope to extract the correct semantic
information, including width, from the boundary, we select the second
definition. Namely, our boundary is a connection of edges between the
object and the background. This boundary can be coded by unit vectors
with horizontal and vertical directions.


Sidhu and Boute [60] proposed to encode a binary picture with
containment codes via a 2x2 window, and then used the codes to lead the
boundary following successfully. Actually, the boundary following is led
by the contents of the 2x2 window. We have developed an algorithm which
utilizes the contents of the window directly to find the boundary.
Neither conversion to a binary picture nor encoding with containment
codes is necessary. Our window is described dynamically in terms of the
boundary vectors. Figure 4.2 shows the four possible configurations.
The pixels A, B, C, and D are defined relative to the boundary vector P.
The object is to the right of P, so that A is darker than the threshold
t. The background is to the left of P, so that C is lighter than t. In
the following algorithm, $u_x$ and $u_y$ are the unit movements, or unit
vectors, in the X and Y directions respectively. A, B, C, D denote the
coordinates of the pixels. G(B) is the gray level of the pixel indicated
by B.

Before using this algorithm, the picture is scanned from left to right
and from top to bottom to find the first pixel F which is darker than
the threshold t.

Algorithm 4.2: Boundary Following

Input: F, $u_x$, $u_y$, and threshold t.

Output: A boundary chain U of i unit vectors.

Method:

(1) Set P = $u_x$, i = 1, U(1) = P,
A = F, C = F - $u_y$, S = C, go to (3)

(2) If (S = C) then terminate
otherwise i = i+1, U(i) = P

(3) D = C+P
If (G(D) < t) then go to (4)
otherwise P = C-A, A = D, go to (2)

(4) B = A+P
If (G(B) < t) then go to (5)
otherwise A = B, C = D, go to (2)

(5) C = B, P = B-D, go to (2)
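
The five steps above can be sketched in Python as follows. This is an
illustrative sketch under stated assumptions, not the original FORTRAN
program: the function name `follow_boundary`, the list-of-rows image
layout, image coordinates with Y pointing down, the convention that a
pixel belongs to the object when its gray level is at least t, and the
treatment of positions outside the picture as background are all
choices made here.

```python
def follow_boundary(image, t, ux=(1, 0), uy=(0, 1)):
    """Sketch of Algorithm 4.2: returns the boundary chain U."""
    height, width = len(image), len(image[0])

    def gray(p):
        x, y = p
        if 0 <= x < width and 0 <= y < height:
            return image[y][x]
        return float("-inf")          # outside the picture: background

    def add(p, q):
        return (p[0] + q[0], p[1] + q[1])

    def sub(p, q):
        return (p[0] - q[0], p[1] - q[1])

    # Raster scan for the first object pixel F.
    F = next(((x, y) for y in range(height) for x in range(width)
              if image[y][x] >= t), None)
    if F is None:
        return []

    # Step (1): object pixel A on the right of P, background pixel C on the left.
    P, A, C = ux, F, sub(F, uy)
    S = C
    chain = [P]
    while True:
        D = add(C, P)                     # step (3)
        if gray(D) >= t:
            P, A = sub(C, A), D           # D is object: turn left
        else:
            B = add(A, P)                 # step (4)
            if gray(B) >= t:
                A, C = B, D               # B is object: go straight
            else:
                P, C = sub(B, D), B       # step (5): turn right
        if C == S:                        # step (2): back at the start
            return chain
        chain.append(P)
```

For a 2x2 dark block in a light picture the sketch returns eight unit
vectors that sum to zero. Passing `ux=(2, 0)` and `uy=(0, 2)` samples
every second pixel in each direction.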
This algorithm is very efficient because only the pixels next to the
boundary are processed. The computation time is proportional to the
number of vectors in the boundary. The time required for a boundary of
1000 unit vectors is only about 0.6 second on a PDP 11/45 with auxiliary
memory for pictures. The algorithm was implemented in FORTRAN. If we
want to take samples every $k_x$ pixels horizontally and every $k_y$
pixels vertically, we only need to set $u_x = (k_x, 0)$ and
$u_y = (0, k_y)$.

Figure  4.3  shows  two  simple  images,  and  Figure  4.4  shows  the  boun- 
daries obtained  by  using  our  algorithm. 

4.3  Boundary  Smoothing 

Our boundary following algorithm is very fast. However, the results in
Figure 4.4 show that further processing is necessary to smooth out the
zig-zags, because we need more accurate angle information for later
processing. In other words, we need to approximate the boundary better.
Pavlidis and Horowitz [51], and Ramer [52] studied algorithms for
approximating boundaries. These methods are not suitable for our
processing because of their computational cost: we do not want to spend
much time in the preprocessing stage. Besides, we hope to keep the sharp
corners, which are usually meaningful for recognition but are sometimes
smoothed out in approximation.

The output from our boundary following algorithm is a string of unit
vectors. The input to our parsing algorithms discussed in Section 3.6
is a string of vectors which approximates the true shape more
accurately. Therefore, we developed smoothing algorithms which use the
string of unit vectors as input and produce a string of longer vectors
as output. This mechanism can be described as a translation. A machine
which performs the translation can be called a transducer [43]. We used
a translation which can be formulated as a pushdown transducer or an
attributed finite transducer [44]. For simplicity, we will explain the
attributed finite transducer.

In the following definition, the $q_j$'s are attributed states. Each
state $q_j$ represents a subchain $s_j$ which is accepted but not
translated, and which is described by the associated attributes.
$\ell v$ denotes a series of $\ell$ unit vectors v, or $\ell$ times v.
-v is the negative of v, i.e., -v and v have the same length but
opposite directions. $A \wedge B$ denotes that B follows A. $\delta$ is
a mapping from $Q \times I$, under condition C, to finite subsets of
$Q \times O^*$. The mapping is performed when condition C is true. For
each state transition, there is a set of attribute rules. The
attributes of a state may be unit vectors or numbers. In the transition
rules, i, m, n, v, u are unit vectors and p, q, $\ell$, k are numbers.
The attribute conditions for a transition are described above the right
arrow. For each transition, the input unit vector is compared with the
vector attributes and the attribute conditions are checked. Then the
machine goes to the next state with the appropriate output and transfers
the attributes according to the attribute rules. Each expression of the
output is a vector in O.


Definition 4.1: Attributed Finite Transducer A

A is a 6-tuple, $(Q, I, O, \delta, S, F)$.

I = the input set consisting of 4 unit vectors and an end marker \$,

\[ I = \{(1,0),\ (0,-1),\ (-1,0),\ (0,1),\ \$\} \]

O = the output set,

\[ O = \{(n_1, n_2) \mid (n_i = 0,\ n_{3-i} \neq 0) \text{ or } (n_i = \pm 1,\ n_{3-i} = \text{any integer}),\ i = 1,2\} \]

Q = a set of states with attributes, $\{q_j \mid j = 0,\ldots,9\}$


\[
\begin{array}{ll}
q0: & s_0 = \lambda \text{ (empty)} \\
q1_{v}: & s_1 \text{ has one unit vector } v \\
q2_{\ell,v}: & s_2 = \ell v,\ \ell > 1 \\
q3_{v,u}: & s_3 = v \wedge u \\
q4_{\ell,v,u}: & s_4 = \ell v \wedge u,\ \ell > 1 \\
q5_{v,u}: & s_5 = v \wedge u \wedge -v \\
q6_{v,u}: & s_6 = v \wedge u \wedge v \\
q7_{\ell,v,u,k}: & s_7 = \ell v \wedge u \wedge kv \\
q8_{v,\ell,u}: & s_8 = v \wedge \ell u,\ \ell > 1 \\
q9_{\ell,v,u,k}: & s_9 = \ell v \wedge u \wedge kv \wedge -u,\ \ell > 1,\ k > 1
\end{array}
\]

S = the initial state q0
i ♦ y,  k ♦ 

<q6>,n'->  * (q7l,v/i,k^)  v * m,  u - 


v ♦ m, 

<q7p,m,n,q'm)  * (q7l,v^J,k'^}  u «■  n. 


<q7p,*,n,q'n> 


P>2q 

♦ (q1v,(p-1)m/  2qm+n),  v ♦ n 


q/2<p<2q 

♦ (q1v,(p+q)m+n>,  v * n 


P<q/2 


t * q-p 


♦ (q4t,VA>'2pm+n)'  v * m 

u * n 

l ♦ P 
k ♦ q 

^q7p,m,n,q'  n)  * (q9i,v,u,k'^  v * m 

u ♦ n 


v «-  m 


<q8m,q,n'n)  * <q8v,t,u'*>'  u * n 

t * q+1 


I 


(q8m  „ _ ,m) 


q<4 


q>4 


l * q-1 


<q4l/v,u'm+n>  v * n 


u ♦ m 


t * q-2 


♦ Cq4  ,m+2n)  v ♦ n 

*•/ v /U 


u ♦ m 


Cq8_  _ „,-m) 


2 

n 

l ♦ p 
k ♦ q+1 
* <q4l,v/U'm+n>  v * n 

u ♦ -rn 

q>4  1 * q'2 

* (9*.  ..  w/m+2n)  v ♦ n 

VU/V 


t * m+n+1 


t ♦ 2 


?p,m,n,q'-n)  * «*1/V,(p+q>m+n>,  v - -n 
Figure 4.5 The State-Transition Diagram of Transducer A
The state flow-chart of Transducer A is shown in Figure 4.5. Because
Transducer A is deterministic, the computation is very fast and its time
varies linearly with the number of input unit vectors. This algorithm
has been implemented in FORTRAN on the PDP 11/45. The computation time
for 1000 unit vectors is about 0.35 seconds. For illustration, the
boundary unit vector chains of Figure 4.4 were used as input to our
program to obtain the output shown in Figure 4.6.

The boundaries in Figure 4.6 are smoother than those in Figure 4.4 and
all the sharp corners are retained. The experiment regarding the noise
effect on descriptors in Section 3.5 was based on the output of this
transducer. Figure 4.6 shows two typical shapes used in that
experiment. However, the curves can be smoothed even further for two
reasons: (1) as far as parsing efficiency is concerned, we would like to
reduce the number of vectors in the boundary chain because the parsing
time increases rapidly with the number of vectors, and (2) the smoother
the boundary is, the less ambiguous are the primitive extractions if the
corners are retained. To avoid increasing the complexity of our
transducer, we used a heuristic algorithm, which is based on the
connection angles and lengths of the vectors, to further smooth the
boundary without introducing too much distortion.

Due to the digitization grid, few straight lines stay straight after
digitization; only those which just fit the scanning grid do [71,72].
Let us consider the output chain of Transducer A. If there is no noise,
a straight line could be coded as a chain of small vectors with angle
changes up to $r_1 \approx 0.06\pi$. See Figure 4.7. When noise is
present, the situation becomes much more complicated. If we assume that
the noise does not move the false boundary more than one pixel away from
the true boundary, and the probability of the noise occurrence is not
very large, we can consider one pixel at a time. Figure 4.8 shows that
$r_2 \approx 0.25\pi$.

In fact, the noisy pattern is much more complicated than expected. We
use the following heuristic algorithm to detect the straight lines in
the boundary. The algorithm has three parameters, $A_1$, $A_2$, and H.
$A_1$ and $A_2$ are angle distortion maxima and H is a vertical
distortion maximum.

A vector subchain is approximated by a vector if the following four
requirements are satisfied.

1. Total angular change is < $A_1$.

2. No angular change within the subchain is > $A_2$.

3. No two consecutive angular changes are > $A_1$, or < $-A_1$.

4. The distance from any point of the subchain to the approximating
vector is < H.

$A_1$, $A_2$, and H can be any positive numbers. We used
$A_1 = r_1 = 0.06\pi$, $A_2 = 2r_2 = 0.5\pi$, and $H = \sqrt{2}$, where
$\sqrt{2}$ is the diagonal of a pixel; $r_1$ and $r_2$ were obtained
from the previous noise analysis. We used $A_2 = 2r_2$ instead of $r_2$
because sometimes the boundary is noisier than one pixel at a time.
Although there might be better choices for $A_1$, $A_2$, and H, these
three numbers gave us reasonably good results.
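
The four requirements can be checked for a candidate subchain roughly as
in the Python sketch below. It is a hedged illustration only: the
function names `turn_angles` and `is_nearly_straight` are hypothetical,
the angular changes are taken as signed turning angles between
consecutive vectors, the total angular change is compared in absolute
value, the third requirement is read with $A_1$ as its bound, the
comparisons are inclusive, and the distance is measured from each
interior vertex to the line through the subchain's end points. None of
these details is fixed by the text above.

```python
import math

def turn_angles(chain):
    """Signed turning angle between each pair of consecutive vectors."""
    return [math.atan2(ax * by - ay * bx, ax * bx + ay * by)
            for (ax, ay), (bx, by) in zip(chain, chain[1:])]

def is_nearly_straight(chain, a1=0.06 * math.pi, a2=0.5 * math.pi,
                       h=math.sqrt(2)):
    """True if the subchain may be replaced by one vector (sketch)."""
    turns = turn_angles(chain)
    # Requirement 1: total angular change.
    if abs(sum(turns)) > a1:
        return False
    # Requirement 2: no single angular change beyond a2.
    if any(abs(a) > a2 for a in turns):
        return False
    # Requirement 3: no two consecutive changes beyond a1 in the same sense.
    for a, b in zip(turns, turns[1:]):
        if (a > a1 and b > a1) or (a < -a1 and b < -a1):
            return False
    # Requirement 4: distance of each interior vertex to the end-point line.
    ex = sum(v[0] for v in chain)
    ey = sum(v[1] for v in chain)
    length = math.hypot(ex, ey)
    if length == 0:
        return False
    x = y = 0.0
    for vx, vy in chain[:-1]:
        x, y = x + vx, y + vy
        if abs(ex * y - ey * x) / length > h:
            return False
    return True
```

For example, the gently wobbling chain `[(4, 1), (5, 1), (4, 1)]` passes
all four checks with the default parameters, while the right-angle
corner `[(5, 0), (0, 5)]` fails the first one.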

Noise may turn a short smooth curve into a small concavity or convexity.
This situation may not be detected because there may be only 2 vectors
in the subchain and the first requirement is not satisfied. So, another
simpler subroutine can be used to detect and reduce small concavities or
convexities by checking only the second and fourth requirements. But
the small concavities/convexities can be smoothed out only when they
occur in the convex/concave portion of the boundary.

After the boundary vector chain is smoothed, a simple procedure is
performed to locally find a sharp convex angle to be the starting point.
The whole boundary smoothing algorithm is summarized as follows.

Algorithm 4.3: The Smoothing Algorithm

Input: A vector chain from Transducer A.

Output: A smoothed vector chain.

(1) Set $A_1$, $A_2$, and H.

(2) Compute length of each vector and angle between vectors.

(3) Combine consecutive vectors whose connection angles are
zero.

(4) Approximate sub-vector-chains with longer vectors by using
the four requirements.

(5) Reduce small concavities and convexities.

(6) Find a local sharp convex point to be the starting point,
and shift the chain with respect to the starting point.

(7) Terminate.

We applied this algorithm to the shapes in Figure 4.6 and obtained
the results shown in Figure 4.9.
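
Steps (3) and (6) of Algorithm 4.3 are simple enough to sketch in
Python. The sketch below is an assumption-laden illustration: the
function names are hypothetical, the chain is treated as closed, Y is
assumed to point down with the object on the right of the direction of
travel (so a convex corner is a positive turn), and "a local sharp
convex point" is simplified to the corner with the largest positive
turning angle.

```python
import math

def merge_collinear(chain):
    """Step (3): combine consecutive vectors with zero connection angle."""
    merged = []
    for vx, vy in chain:
        if merged:
            px, py = merged[-1]
            same_direction = (px * vy - py * vx == 0) and (px * vx + py * vy > 0)
            if same_direction:
                merged[-1] = (px + vx, py + vy)
                continue
        merged.append((vx, vy))
    return merged

def start_at_sharpest_convex(chain):
    """Step (6): rotate a closed chain to start at its sharpest convex corner."""
    def turn(i):
        # Turning angle at the vertex between chain[i-1] and chain[i];
        # index -1 wraps around because the chain is closed.
        (ax, ay), (bx, by) = chain[i - 1], chain[i]
        return math.atan2(ax * by - ay * bx, ax * bx + ay * by)

    best = max(range(len(chain)), key=turn)
    return chain[best:] + chain[:best]
```

Applied to the eight unit vectors around a 2x2 block,
`merge_collinear` yields the four sides of the square.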
4.4 Recognition Functions

As  we  can  see  from  Figure  4.9,  the  boundary  chains  following  the 
smoothing  processes  are  rather  smooth  and  accurate,  although  they  are 
not  exact.  As  discussed  in  Chapter  3,  the  primitive  extraction  can  be 
done  during  the  parsing,  if  proper  recognition  functions  are  used.  The 
recognition  functions  can  be  different  with  different  primitives  and 
nonterminals.  In  our  implementation,  we  assumed  each  primitive  or  non- 
terminal has  a reference  point  in  the  transformed  space.  We  used  the 
same  recognition  function  for  all  the  primitives  for  simplicity  and  a 
slightly  different  function  for  all  the  nonterminals.  These  functions 
are  derived  from  the  study  of  noise  effect  on  normalized  descriptors. 
(See  Section  3.5) . 

A C-descriptor (C,L,A,S) of $V = v_1 \ldots v_m$ can be normalized by N,

\[ N: (C, L, A, S) \rightarrow (C/L,\ A/2\pi,\ S/AL). \]

Since the three variables are supposed to be independent, they are
treated independently. The recognition of V as a curve segment P, whose
normalized descriptor is $(C_p/L_p,\ A_p/2\pi,\ S_p/A_pL_p)$, can be
achieved by the following function,

\[
R(P,V) =
\begin{cases}
1 & \text{if } |C_p/L_p - C/L| \le t_1, \\
  & \phantom{\text{if }} |A_p/2\pi - A/2\pi| \le t_2, \\
  & \phantom{\text{if }} |S_p/A_pL_p - S/AL| \le t_3, \\
  & \text{and } |L_p/L_{op} - L/L_o| \le t_4 \\
0 & \text{otherwise}
\end{cases}
\]

where $L_p/L_{op}$ and $L/L_o$ are the sizes of the curve segment
proportional to the whole boundary, and the t's are threshold values.
Further investigation of this function disclosed that R is closer to
human perception if $t_1$ is proportional to $C_p/L_p$. For an open
curve of constant length L, the proportional difference in width of the
opening makes more sense in discrimination than the absolute
difference. Therefore, the function works better if
$t_1 = k_1 (C_p/L_p)$. The same reasoning applies to the fourth
inequality, so $t_4 = k_4 (L_p/L_{op})$. Since A can be zero, we modify
the third inequality to avoid the possibility of zero in the denominator
of S/AL:

\[
\left| \frac{S_p}{L_p} - \frac{S}{L} \right| \le t_3\, a,
\quad \text{where } a =
\begin{cases}
|A_p| & \text{if } |A_p| > k_a \\
k_a & \text{if } |A_p| \le k_a
\end{cases}
\]

and $k_a$ is a positive number.

When $S_p$ and S have opposite signs, the corresponding P and V decline
to opposite directions. Perceptually, they decline differently though
they may differ little in value. So, we may use a tighter threshold
when $S_p$ and S have different signs. Thus, we modify $t_3$ by setting

\[
t_3 =
\begin{cases}
0.5\, k_3 & \text{if } S_p \cdot S < 0 \\
k_3 & \text{otherwise.}
\end{cases}
\]
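
Putting the modified thresholds together, the recognition function for a
curve primitive can be sketched in Python as below. This is a hedged
illustration, not the original program: the class `CurveDescriptor`, the
function `recognize_curve`, and the default constants are hypothetical,
and the descriptor values C, L, A, S are taken as given numbers (their
computation is described in Chapter 3).

```python
import math
from dataclasses import dataclass

@dataclass
class CurveDescriptor:
    C: float    # attribute C of the C-descriptor
    L: float    # attribute L (curve length)
    A: float    # attribute A (total angular change, radians)
    S: float    # attribute S
    L0: float   # length of the whole boundary the curve belongs to

def recognize_curve(p, v, k1=0.45, t2=0.15, k3=0.6, k4=0.3, ka=0.4):
    """Return 1 if V is accepted as primitive P, else 0 (sketch).

    The default constants are illustrative only.
    """
    t1 = k1 * (p.C / p.L)                        # t1 proportional to Cp/Lp
    t4 = k4 * (p.L / p.L0)                       # t4 proportional to Lp/Lop
    a = abs(p.A) if abs(p.A) > ka else ka        # guards against A near zero
    t3 = 0.5 * k3 if p.S * v.S < 0 else k3       # tighter if signs differ
    accepted = (
        abs(p.C / p.L - v.C / v.L) <= t1
        and abs(p.A - v.A) / (2 * math.pi) <= t2
        and abs(p.S / p.L - v.S / v.L) <= t3 * a
        and abs(p.L / p.L0 - v.L / v.L0) <= t4
    )
    return 1 if accepted else 0
```

In this sketch, two candidates whose S/L values differ from the
reference by the same amount can be treated differently: the one whose S
has the opposite sign faces the halved threshold and may be rejected.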


As mentioned before, S measures, to some extent, the degree of
declination of a simple curve segment. When the curve is complex, the
declination becomes less significant. Besides, the experiments show
that S is very sensitive to noise occurring close to the ends of the
curve segment when the curve segment is complex. Therefore, S is not
used in recognizing nonterminals. The recognition function for curve
segments is assumed as follows.


Assumption 4.1:

$V = v_1 \ldots v_m$ is recognized as P iff R(P,V) = 1, where

$D(P) = (C_p, L_p, A_p, S_p)$, $D(V) = (C, L, A, S)$

otherwise

where $k_2 = 2\pi t_2$,

1 if P is a curve primitive
0 if P is a nonterminal

if $S_p \cdot S \ge 0$

$|A_p|$ if $|A_p| > k_a$

$k_a$ if $|A_p| \le k_a$

and $k_1$, $k_2$, $k_3$, $k_4$, and $k_a$ are positive constants.


This recognition function is developed from the noise analysis in
Section 3.5. We developed such a function because (1) it is
computationally simple and (2) we do not have a large number of samples
to obtain a complete noise analysis. Basically, each curve primitive or
nonterminal is given a recognition region in the transformed descriptor
space. If the transformed description of V falls in the recognition
region of P, V is recognized as P. Otherwise, V is not recognized as P.
The above k's specify the size of the recognition region. The greater
the k's are, the bigger the region is. The extreme cases are (1) if the
k's are zero, only the curve segments which have exactly the same
transformed descriptor as P can be recognized, and (2) if the k's are
very large numbers, all possible curve segments can be recognized as P.
Therefore, we want to choose k's which are large enough to recognize the
noisy curve segments but small enough so that the wrong curve segments
will not be recognized. Fortunately, the production rules can be used
to eliminate the wrong curve segments whose transformed descriptors fall
in the recognition region. This idea was explained in the PEE parsing
in Section 3.6. Hence, the size of the recognition region is not very
critical, as long as it is big enough to cover the noisy curve segments.
But we do not want to use a very big recognition region even if the
production rules are sufficient to describe the structural differences,
because a bigger recognition region will admit a larger number of
candidates to the parsing table. Though the production rules may
eliminate the wrong candidates, the processing time will be longer.
Therefore, we would like to find the recognition region which is barely
big enough to cover the noisy cases, in order to minimize the computing
time. From the noise analysis of Section 3.5, we have an idea of the
distribution of the normalized descriptors. Here, we simply choose the
k's whose corresponding recognition region is about two to three times
as wide as the distribution in each dimension of the transformed space.
The numbers chosen for our experiments are $0.3 \le k_1 \le 0.45$,
$0.2 \times \pi \le k_2 \le 0.3 \times \pi$, $0.4 \le k_3 \le 0.6$,
$0.2 \le k_4 \le 0.3$, $k_a = 0.4$. We used smaller k's to recognize
shapes which are less noisy or to discriminate classes which are
different only in small details, and greater k's to recognize noisier
shapes or to discriminate classes which are very different in structure.
The variation of k's is somewhat determined by the relationship between
the noise and the recognition region.

The  function  for  recognizing  angle  primitives  is  suggested  as  fol- 
lows. 

Assumption 4.2:

$V = v_1 \ldots v_m$ is recognized as A iff R(A,V) = 1, where V has
angle length, $\ell$, and total angular change, a, from $v_1$ to $v_m$,
and

\[
R(A,V) =
\begin{cases}
1 & \text{if } \ell \le k_c \text{ and } |A - a| \le k_5 \\
0 & \text{otherwise}
\end{cases}
\]

$k_c$ is the corner tolerance mentioned in Chapter 3 and $k_5$ is a
positive constant.
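
The angle test is small enough to state directly in code. The Python
sketch below is illustrative only: the function name `recognize_angle`
and its argument names are hypothetical, the angle length and total
angular change of V are assumed to have been measured already, and the
default tolerance simply reuses the value $0.25\pi$ quoted for the
angle threshold.

```python
import math

def recognize_angle(angle_p, angle_length, total_turn, kc,
                    k5=0.25 * math.pi):
    """Return 1 if V is accepted as the angle primitive, else 0 (sketch).

    angle_p      -- reference angle A of the primitive (radians)
    angle_length -- length covered by the smoothed corner in V
    total_turn   -- total angular change a of V (radians)
    kc           -- corner tolerance
    k5           -- angular tolerance
    """
    if angle_length <= kc and abs(angle_p - total_turn) <= k5:
        return 1
    return 0
```

A right-angle primitive, for instance, accepts a corner that has been
smoothed over a short length and turns by slightly more than $\pi/2$,
but rejects one that stretches beyond the corner tolerance.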

$k_5 = r_2 = 0.25\pi$ is obtained from the noise analysis in Section 4.4.
An  angle  primitive  is  supposed  to  have  no  length.  But  it  may  be 
smoothed  to  cover  a few  pixels  when  noise  is  present.  The  corner  toler- 
ance kc  allows  an  angle  to  cover  a length  up  to  kc.  Of  course,  kc 
Experiment 4.1:


Figure 4.10 shows two arbitrary angle views of airplane models B52
and F102. They are very different in structure. We constructed
grammars $G_{B52,A}$ and $G_{F102,A}$ interactively to describe them
respectively. The corresponding segmentations are illustrated in
Figures 4.11 and 4.12. The attribute rules in the following grammars
are omitted for simplicity.

Eleven shapes with respect to various rotations of Figure 4.10(a) and
nine shapes with respect to various rotations of Figure 4.10(b) were
obtained through all the procedures described in Sections 4.2 and 4.3.
They can be found in Appendix B. These shapes were processed by the PEE
Earley's parser with the above two grammars, $G_{B52,A}$ and
$G_{F102,A}$. The 40 tests, 20 shapes for each of the 2 grammars, were
all correct. Since some B52's are quite noisy, we use a bigger
recognition region. We used $k_1 = 0.45$, $k_2 = 0.3 \times \pi$,
$k_3 = 0.6$, $k_4 = 0.3$ in this experiment. The average processing
time per test is 0.19 seconds for a chain of 30 vectors and 0.29 seconds
for a chain of 36 vectors. We then converted our grammars to a
finite-state form by an algorithm which will be discussed in Chapter 6,
and processed these shapes by the PEE finite automaton with the two
converted finite-state grammars, $F_{B52,A}$ and $F_{F102,A}$. The 40
tests were also all correct. The average processing time per test is
0.024 seconds for a chain of 30 vectors and 0.03 seconds for a chain of
36 vectors.

The context-free shape grammars (CFSG) $G_{B52,A}$ and $G_{F102,A}$ and
the finite-state shape grammars (FSSG) $F_{B52,A}$ and $F_{F102,A}$ are
listed in the following pages. As mentioned in Chapter 3, the
primitives used to describe a shape can be flexibly assigned, no matter
whether the grammar is a CFSG or a FSSG. For the convenience of
comparison, the CFSG and the converted FSSG discussed here have the same
primitive set. The definition of an ordinary finite-state grammar
allows one primitive in a production, i.e., the production is either
$N_1 \rightarrow p_1 N_2$ or $N_1 \rightarrow p_2$, where $N_1$ and
$N_2$ are nonterminals and $p_1$, $p_2$ are primitives. Our FSSG's can
be of this ordinary form. In this case, $p_1$ may be an angle or a
curve primitive, and $p_2$ is an angle primitive. Since an angle
primitive always follows a curve segment in our shape descriptions (see
Section 3.4), we allow one curve primitive and one angle primitive in
each production to reduce the number of productions by half. That is,
our FSSG's have productions of the form $N_1 \rightarrow c\,a\,N_2$ or
$N_1 \rightarrow c\,a$, where "c" is a curve primitive and "a" an angle
primitive. The attribute rules in the following grammars are omitted
for simplicity. As a matter of fact, the
attribute  rules  for  N-production  rules  in  a FSSG  are  not  necessary,  be- 
cause the  recognition  of  nonterminals  in  finite  automaton  is  not  per- 
Experiment 4.2:

Figure 4.13 shows three views of two airplane models, F86 and MIG-15.
They are very similar in structure. T, V, and U represent different
angle views. They are all close to the top view. (a) differs from (b)
and (c) by two small missile-tails and a machine gun on the right wing.
But the machine gun may not appear in the digitized picture. We can
construct a grammar, $G_{F86,T}$, to distinguish it from MIG-15's. The
corresponding segmentation can be found in Figure 4.14. As we explained
in Section 3.7, the shape of F86,T may be recognized by grammars of
MIG-15, but the shapes of MIG-15 cannot be recognized by $G_{F86,T}$
because MIG-15 shapes do not have missile-tails.

MIG-15,V and MIG-15,U are different mainly in the width of the fuselage
close to the tail. The whole tail may not be designated as a primitive,
since it contains two tail ends which may be the starting points. We
need to check the nonterminal which represents the whole tail or the
whole shape excluding the tail. A grammar $G_{MIG-15,U}$ is constructed
to distinguish the two angle views. The corresponding segmentation can
be found in Figure 4.15.

In this experiment, the recognition of these three angle views forms a
simple classification tree shown in Figure 4.16. Through all the
procedures described in Sections 4.2 and 4.3, we obtained 5 F86,T's, 5
MIG-15,V's, and 3 MIG-15,U's with respect to arbitrary rotations. They
can be found in Appendix C. At the first stage, the unknown shapes can
be processed by the PEE Earley's parser with $G_{F86,T}$ or by the PEE
finite automaton with $F_{F86,T}$, which is the converted finite-state
shape grammar. Either parsing algorithm can achieve completely correct
recognition because the recognition of primitives is sufficient to
distinguish F86,T from MIG-15. The recognition functions described in
Assumptions 4.1 and 4.2 were used with smaller constants: $k_1 = 0.3$,
$k_2 = 0.2 \times \pi$, $k_3 = 0.4$, and $k_4 = 0.2$. At the second
stage, the shapes can only be processed by the PEE Earley's parser with
$G_{MIG-15,U}$, because we need to check the nonterminals, which cannot
be recognized in the PEE finite automaton. All 13 shapes were correctly
recognized via the classification tree.

The CFSG's, $G_{F86,T}$ and $G_{MIG-15,U}$, are listed in the following
pages and the converted FSSG, $F_{F86,T}$, can be found in Appendix D.
There are on average 60 vectors in each boundary chain. The average
classification time per test at the first stage is 0.21 seconds for the
PEE Earley's parser and 0.04 seconds for the PEE finite automaton. At
the second stage, the average classification time per test is 0.24
seconds using the PEE Earley's parser.
The previous experiments demonstrated that our proposed method can
discriminate  shapes  by  their  differences  in  global  structure,  boundary 
details,  or  parts  of  shapes.  They  are  described  in  terms  of  production 
rules,  primitives,  and  nonterminals  respectively.  The  number  of  testing 
patterns,  33,  is  not  large  enough  to  claim  one  hundred  percent  accuracy, 
although  these  33  are  all  correctly  recognized.  However,  this  perfor- 
mance appears  to  be  quite  good  to  demonstrate  the  recognition  capability 
of  our  proposed  syntactic  method.  In  addition,  this  syntactic  method 
with  error-correcting  techniques  can  recognize  partially  distorted 
shapes.  The  distorted  shape  recognition  will  be  discussed  in  Chapter  5. 

Although the results of these experiments seem satisfactory, further
discussion of the following four questions is necessary: (1) What is
the limitation of the recognition power? (2) How do we apply this
method to recognize 3-dimensional objects? (3) Can the grammars be
inferred automatically? and (4) How do we convert a context-free shape
grammar to a finite-state shape grammar, if only a finite-state shape
grammar is necessary for recognition?

The latter two questions will be discussed in Chapter 6. In using this
method to recognize a 3-dimensional object, we must realize that there
are often many different angle views for a rigid object. Each view can
be described by a subgrammar, whose S symbol has a label indicating the
viewing angle. Since the subgrammars are independent, they can be
processed in parallel if a multiprocessor system is used. Otherwise,
all these subgrammars can be put together to form a single grammar.
Some of the subgrammars have very similar primitives. For instance,
some views have similar nose and wings. The combined grammar can be
simplified by combining the similar primitives, the nonterminals and the
same production rules. The simplification will reduce the storage space
required for the grammar and speed up the processing.

The second experiment in Section 4.6 shows that this method can distinguish shapes with small differences. Theoretically, any small difference which can be described by the descriptors can be used for discrimination. But the practical results show that this may not hold because of noise, limited resolution, and so on. We have tried to recognize an angle view, MIG-15,T (see Figure 4.17), which lies between MIG-15,V and MIG-15,U. Three MIG-15,T's were processed through the classification tree in Figure 4.16. They were all classified as MIG-15,V since they are very close to MIG-15,V. Then we tried to distinguish MIG-15,V's from MIG-15,T's. It did not turn out well. Some of the MIG-15,T shapes were misrecognized as MIG-15,V's, and vice versa. The three MIG-15,T's can be found in Appendix C. With the limitations of our resolution and boundary smoothing, if we reduce the size of the recognition region, namely tighten the thresholds k in the recognition functions, the noisy patterns will not be recognized. If we do not reduce the size of the recognition region, the recognition functions together with the production rules are not discriminative enough to distinguish all the testing patterns correctly. To improve the recognition performance, we need to improve the digitization resolution, the boundary approximation, and the recognition functions.

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CHAPTER  5 


DISTORTED  SHAPE  RECOGNITION 

5.1  Introduction

The  syntactic  shape  recognition  method  using  attributed  grammars 
was  proposed  and  discussed  in  Chapter  3.  The  implementation  of  this 
method  on  airplane  shapes  in  Chapter  4 demonstrated  that  this  method  is 
an  effective  shape  recognition  method.  But  the  shapes  we  have  dealt 
with  so  far  were  reasonably  clear  in  the  sense  that  only  a small  amount 
of  random  noise  was  present.  To  show  the  advantage  of  our  method  over 
other  existing  methods,  we  will  discuss  in  this  chapter  the  generalized 
version  of  our  method  and  the  utilization  of  error-correcting  techniques 
to  recognize  some  heavily  distorted  shapes. 

As  mentioned  previously,  the  shapes  can  be  distorted  by  two  dif- 
ferent types  of  noise.  One  type  of  noise  is  random  in  nature  and  will 
be  referred  to  as  random  noise.  The  other  type  of  noise  is  not  random 
and  will  be  referred  to  as  nonrandom  distortion.  The  random  noise  can 
be  filtered  to  a certain  extent  by  some  digital  filtering  techniques 
[22,90],  while  the  nonrandom  distortion  is  difficult  to  remove  without
knowing  the  classification  of  the  shape.  Normally,  random  noise  is 
inevitable  and  nonrandom  distortion  may  often  occur.  For  example,  a 
real  airplane  picture  with  a clear  background  has  a small  amount  of  ran- 
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dom  noise,  and  a picture  of  an  airplane  in  a cloudy  background  may  have 
random  noise  and  nonrandom  distortion,  when  the  clouds  cover  some  por- 
tions of  the  airplane.  In  the  real  world,  a shape  is  often  distorted 
with  both  types  of  noise.  The  conventional  filtering  techniques  or 
recognition  methods  have  handled  the  random  noise,  but  still  cannot  han- 
dle the  nonrandom  distortion  well. 

In  Sections  5.2  and  5.3,  the  recognition  functions  and  the  PEE 
parsing  schemes  will  be  generalized  to  recognize  noisy  shapes.  An 
error-correcting  technique  will  be  introduced  to  the  parsing  scheme  in 
Section  5.4  to  recognize  heavily  noisy  or  distorted  shapes.  Several 
versions of Earley's algorithm with different capabilities will be discussed in Section 5.5 to show the feasibility of these ideas. Earley's algorithm for context-free grammars is selected for modification and discussion mainly because a context-free shape grammar can describe semantically meaningful portions of the shape by nonterminals with attributes.

Some  distorted  shapes  will  be  processed  by  our  error-correcting  PEE 
Earley's  parser  to  show  its  recognition  capability.  Then,  a discussion 
concludes  this  chapter. 

5.2  Generalized  Recognition  Functions

We  have  assumed  that  the  recognition  functions  are  0-1  functions  in 
Section  4.5.  (See  Assumptions  4.1  and  4.2).  The  result  of  such  a 
recognition  function  is  either  "accept"  or  "reject".  If  the  transformed 
descriptor  of  an  unknown  curve  segment  is  within  a certain  neighborhood 
of  the  transformed  descriptor  of  a referenced  curve  segment,  the  unknown 
curve  segment  is  recognized  as  the  referenced  curve  segment.  Otherwise, 
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the  recognition  fails.  The  recognition  function  for  angle  primitives  is 
similar. As a matter of fact, the boundary between "accept" and "reject" is not clear. Tsai and Fu [73] proposed a deformational model. In this model, every noisy pattern is regarded as transformed from a pure pattern through syntactic deformation and semantic deformation. The syntactic and semantic deformations are symbolic errors and attribute deviations, respectively. This deformational model is applied to primitive extraction. Each deformation is assigned a probability and each class has an a priori probability. A Bayes recognition rule is hence obtained based upon these assigned probabilities.

The  probabilistic  model  is  not  practical  in  our  case  because  it  is 
difficult  to  collect  a large  enough  number  of  samples  to  obtain  a mean- 
ingful probability  function  for  each  referenced  primitive  or  nontermi- 
nal. Besides,  Tsai's  deformational  model  cannot  be  applied  to  nontermi- 
nal recognition  adequately. 

For  simplicity,  we  attempt  to  generalize  our  0-1  recognition  func- 
tions to  take  care  of  the  so-called  semantic  deformation,  or  attribute 
deviation,  and  to  use  error-correcting  techniques  to  solve  the  syntactic 
deformation.  For  a generalized  recognition  function,  we  use  the  idea  of 
membership  function  from  fuzzy  set  theory  [74].  The  membership  function 
for  a primitive  maps  from  the  domain  constituted  by  the  primitive  and 
the unknown vector chain to the range [0,1] on the real line.

Definition 5.1:

Let P and V be a primitive and a vector chain to recognize. The recognition function R maps from P x V to [0,1]. That is, $R(P,V) = r$, $r \in [0,1]$.
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If  r = 1,  the  V is  recognized  as  P without  a doubt.  If  r = 0,  the 
recognition  fails.  If  0 < r < 1,  the  recognition  is  partially  success- 
ful. From  another  point  of  view,  r can  be  used  to  reflect  the  similari- 
ty between  P and  V.  If  the  recognition  is  based  on  the  transformed 
descriptors,  we  may  define  the  recognition  function  as  follows. 
Definition 5.2:

$R(P,V) = R(T(P), T(V)) = r \in [0,1]$, where $T(\cdot)$ denotes the transformed descriptor.

Definition  5.1  can  be  extended  to  the  case  of  a primitive  string. 
Definition 5.3:

Let $\alpha = P_1 P_2 \cdots P_k$ be a string of k primitives and $V = v_1 v_2 \cdots v_m$ be a vector chain to recognize. There are n feasible noisy boundary segmentations, or NBS's, each of which segments V into k vector subchains. Let $\Phi_i$ be the set of vector subchains with respect to the i-th of the n NBS's, $\Phi_i = \{s_{i1}, s_{i2}, \ldots, s_{ik}\}$, where the $s_{ij}$'s are vector subchains. The string recognition function $R_s$ of $\alpha$ and $\Phi_i$ is

$$R_s(\alpha, \Phi_i) = R_s(P_1, s_{i1}, P_2, s_{i2}, \ldots, P_k, s_{ik}) = r_i \in [0,1]$$

And the recognition function of $\alpha$ and V is

$$R_s(\alpha, V) = \max_{1 \le i \le n} R_s(\alpha, \Phi_i) = \max_{1 \le i \le n} r_i = r$$

If n = 0, then $R_s(\alpha, V) = r = 0$.

The computation of $r_i$ from the $P_j$'s and $s_{ij}$'s is an interesting topic of study. If we consider that each $P_j$ and its corresponding $s_{ij}$ are related by a fuzzy quantity, then the relation between $P_1 P_2 \cdots P_k$ and $s_{i1} s_{i2} \cdots s_{ik}$ can be considered as a string of concatenated fuzzy quantities. Lee and Zadeh [82] proposed fuzzy languages and fuzzy grammars. Also, they defined the operations of union, intersection, and concatenation of fuzzy languages. Of course, the concatenation of strings was also defined. Recently, Lakshmivarahan and Rajasethupathy [83] followed the same definitions in studying the fuzzification of formal languages and the synthesis of fuzzy grammars. However, our attributed shape grammars are not fuzzy grammars. [82] and [83] fuzzify the grammars by giving each production rule a fuzzy membership, while our shape grammars do not have fuzzy membership for the production rules. Besides, the fuzzy quantity obtained through the concatenation of fuzzy quantities defined by [82] does not seem correct semantically for our shape recognition. According to the definition in [82], the membership of $P_1 P_2$ equals the minimum of the memberships of $P_1$ and $P_2$. But in our shape recognition, if $P_1$ is perfectly recognized and $P_2$ is poorly recognized, we would like to get a membership value somewhere between those of $P_1$ and $P_2$. A very intuitive idea is to take the weighted average of the memberships of $P_1$ and $P_2$ as the membership of $P_1 P_2$. This weighted average idea has been suggested by Bellman and Zadeh [89]. The weighting coefficients reflect the relative importance of the corresponding primitives or nonterminals.

The  addition  and  multiplication  of  fuzzy  quantities  have  been  stu- 
died and  published  [80,81].  We  may  use  fuzzy  additions  and  fuzzy  multi- 
plications to  obtain  a good  recognition  function  for  a string  of  primi- 
tives. But,  for  simplicity  in  computation,  we  used  the  idea  of  weighted 
average  and  obtained  semantically  reasonable  membership  values.  From 
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another  point  of  view,  we  assume  there  is  a recognition  hierarchy,  i.e., 
the  membership  value  of  a string  of  primitives  can  be  computed  based  on 
the  membership  values  of  the  primitives  and  the  weights  associated  with 
the  primitives.  Taking  advantage  of  the  attributes  associated  with  each 
primitive,  we  may  use  the  ratio  of  the  curve  length  of  the  primitive  to 
the  curve  length  of  all  the  primitives  in  the  string  as  the  weight. 
Therefore, if $P_1$ of length $l_1$ and $P_2$ of length $l_2$ are recognized with membership values $r_1$ and $r_2$ respectively, the membership value of $P_1 P_2$ can be calculated by $\frac{l_1}{l_1+l_2} r_1 + \frac{l_2}{l_1+l_2} r_2$. So, if $P_2$ is completely rejected, i.e., $r_2 = 0$, we still have a value of $\frac{l_1}{l_1+l_2} r_1$. In such a case, if $P_1$ is much longer than $P_2$, the major part of the total curve can still be recognized with a reasonable membership value. With this hierarchical assumption, the general form of the string recognition function can be written as follows.

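Definition 5.4:

$$R_s(\alpha, \Phi_i) = R_s(P_1, R(P_1, s_{i1}), P_2, R(P_2, s_{i2}), \ldots, P_k, R(P_k, s_{ik}))$$

$$= R_s(P_1, r_{i1}, P_2, r_{i2}, \ldots, P_k, r_{ik}) = r_i \in [0,1]$$

where $r_{ij} = R(P_j, s_{ij})$, $1 \le j \le k$, and

$$R_s(\alpha, V) = \max_{1 \le i \le n} R_s(\alpha, \Phi_i) = \max_{1 \le i \le n} r_i = r \in [0,1]$$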


Definition 5.2 can easily be plugged into Definition 5.4 to obtain an expression based on transformed descriptors. In the following development, the hierarchical relationship is assumed.
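The length-weighted average and the maximum over feasible segmentations can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the experiments: the function names `string_membership` and `best_segmentation` are hypothetical, the primitive memberships $r_{ij}$ are assumed to be already computed by some recognition function R, and the weights are assumed to be the curve lengths of the reference primitives.

```python
from typing import Optional, Sequence, Tuple


def string_membership(lengths: Sequence[float], memberships: Sequence[float]) -> float:
    """Length-weighted average of primitive memberships (each in [0, 1]).

    lengths[j] is the curve length of primitive P_j (assumed weight),
    memberships[j] is r_ij = R(P_j, s_ij) for one segmentation.
    """
    if len(lengths) != len(memberships):
        raise ValueError("one membership value is needed per primitive")
    total = sum(lengths)
    if total <= 0:
        raise ValueError("total curve length must be positive")
    return sum(l * r for l, r in zip(lengths, memberships)) / total


def best_segmentation(
    lengths: Sequence[float],
    candidates: Sequence[Sequence[float]],
) -> Tuple[Optional[int], float]:
    """Return (index, r) of the feasible NBS with the highest membership.

    candidates[i] holds the primitive memberships of the i-th feasible NBS.
    With no feasible NBS (n = 0) the membership is 0, as in Definition 5.3.
    """
    best_index, best_r = None, 0.0
    for i, memberships in enumerate(candidates):
        r_i = string_membership(lengths, memberships)
        if best_index is None or r_i > best_r:
            best_index, best_r = i, r_i
    return best_index, best_r
```

For two primitives of lengths 3 and 1, a perfect match on the first and a complete rejection of the second give `string_membership([3, 1], [1.0, 0.0])` equal to 0.75, whereas the minimum rule of [82] would give 0.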





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As  mentioned  previously,  the  recognition  of  nonterminals  plays  an 


important role in our recognition system. The corresponding recognition functions need to be generalized too. In other words, we would like to have a generalized recognition function which can reflect the similarity between a nonterminal and a curve segment. Since a nonterminal can be broken into primitives, the similarities of the primitives may affect the similarity of the nonterminal. Therefore, the recognition function of a nonterminal depends upon the nonterminal and the vector chain as well as the primitives and their corresponding subchains.

$N \to \alpha_1 | \alpha_2$ indicates two productions $N \to \alpha_1$ and $N \to \alpha_2$. Each production rule produces a string of primitives and nonterminals. In the following definition of a nonterminal recognition function, we use $R_s$ to recognize a string of primitives and nonterminals instead of only primitives. This extension is omitted here for simplicity.

Definition 5.5:

Let $N_i \to \alpha_{i1} | \alpha_{i2} | \cdots | \alpha_{il}$ and $V = v_1 v_2 \cdots v_m$ be a vector chain to recognize. The nonterminal recognition function with respect to $N_i \to \alpha_{ip}$, $1 \le p \le l$, can be defined as

$$R(N_i^{\alpha_{ip}}, V) = R(N_i, V, R_s(\alpha_{ip}, V))$$

The nonterminal recognition function for $N_i$ will be

$$R(N_i, V) = \max_{1 \le p \le l} R(N_i^{\alpha_{ip}}, V) = r \in [0,1]$$


For  our  particular  interest  in  transformed  descriptors  we  may  de- 
fine the  nonterminal  recognition  function  as  follows. 
Definition 5.6:

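Let $N_i \to \alpha_{i1} | \alpha_{i2} | \cdots | \alpha_{il}$ and $V = v_1 v_2 \cdots v_m$. Then

$$R(N_i^{\alpha_{ip}}, V) = R(T(N_i), T(V), R_s(\alpha_{ip}, V))$$

for $N_i \to \alpha_{ip}$, and

$$R(N_i, V) = \max_{1 \le p \le l} R(N_i^{\alpha_{ip}}, V) = r \in [0,1]$$

for nonterminal $N_i$.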

In  the  above  derivations,  R denotes  the  recognition  function  for  a 
primitive or a nonterminal, while $R_s$ denotes the recognition function for a string of primitives and nonterminals. The recognition functions are defined such that, in the general case, it is more convenient to recognize the primitives before the nonterminals. We can see from Definition 5.5 that we need to recognize the strings, the $\alpha$'s (the right sides of the production rules), in recognizing the nonterminal N. This situation favors bottom-up parsing, and it fits our PEE parsing very well. (See Figure 3.16.)

5.3  The  PEE  Parser  Using  Generalized  Recognition  Functions

In  Section  3.4,  we  explained  the  concept  of  primitive-extraction- 
embedding. A PEE  parser  attempts  to  find  a derivation  tree  and  a 
corresponding  feasible  segmentation  of  the  input  vector  chain.  As  il- 
lustrated by  Example  3.6,  there  may  be  more  than  one  feasible  noisy 
boundary segmentation (NBS) corresponding to a reference segmentation. With 0-1 recognition functions, these feasible NBS's are equally recognized or accepted by the grammar. With generalized recognition functions, the membership values are computed along with parsing until the starting symbol is reached. There may be many possible membership values between the noisy boundary chain and the shape grammar with respect to different NBS's and different reference segmentations. We could select the NBS which corresponds to the highest membership value as the best NBS, and this membership value reflects the closest relationship between the vector chain and the shape grammar. Let us define the membership of a vector chain with respect to a shape grammar as follows. Let the input vector chain be $V = v_1 v_2 \cdots v_m$. $V_{1,m+1}$ denotes $v_1 v_2 \cdots v_m v_{m+1}$, with $v_{m+1} = v_1$. In the following discussion, $V_{i,j}$ denotes a vector subchain $v_i v_{i+1} \cdots v_j$.

Definition 5.7:

Let G = (V, T, P, S) be a shape grammar and $V = v_1 v_2 \cdots v_m$ be an unknown vector chain. The membership of V with respect to G is $s = R(G, V_{1,m+1})$. If P has k S-production rules, $S \to \alpha_1 | \alpha_2 | \cdots | \alpha_k$, then

$$R(G, V_{1,m+1}) = \max_{1 \le i \le k} R_s(\alpha_i, V_{1,m+1})$$

From  the  view  point  of  attributed  grammars,  we  may  consider  the 
membership value as an attribute associated with each primitive and nonterminal. With this idea, the general form of the attributed shape
grammar  in  Section  3.4  can  be  modified  as  follows. 


$G_l = (V_l, T_l, P_l, S_{l,r(S)})$

$V_l = \{S_{l,r(S)}, N_{r(N),D(N)}\text{'s}\}$

$T_l = \{F_{r(F),D(F)}\text{'s}, A_{r(A),D(A)}\text{'s}\}$

$P_l$:  $S_{l,r(S)} \to \alpha = (XA)^* XA \ \{\mathrm{Answer}_{c,s}\}$;

        $r(S) \leftarrow$ Recognition function of D(X), r(X), D(A), r(A) for X's and A's in $\alpha$
        $s \leftarrow r(S)$
        $c \leftarrow l$

        $N_{r(N),D(N)} \to \beta = (XA)^* X$

        $D(N) \leftarrow (D(X) \oplus_A)^* D(X)$
        $r(N) \leftarrow$ Recognition function of D(N), D(X), r(X), D(A), r(A) for X's and A's in $\beta$

The r(X) and D(X) denote the membership value and the descriptor of X respectively. And, $X \in \{N\text{'s}, F\text{'s}\}$.

To  understand  clearly  how  the  membership  functions  are  computed  and 
how  the  production  rules  are  selected  in  the  generalized  PEE  parser,  we 
have  to  consider  the  input  vector  chain  and  its  subchains.  During  pars- 
ing, the  recognition  functions  are  performed  and  production  rules  are 
selected  according  to  the  following  rules. 

(a)  An S-production rule

     $S_{l,r(S)} \to \alpha = X_1 A_2 X_3 \cdots X_{2n+1} A_{2n+2} \ \{\mathrm{Answer}_{c,s}\}$,  $X_k \in \{F\text{'s}, N\text{'s}\}$,

     is selected and

     $r(S) \leftarrow R_s(\alpha, V_{1,m+1})$
     $s \leftarrow r(S)$
     $c \leftarrow l$

     if and only if the following conditions are satisfied.

     (1)  For any $A_k$ in $\alpha$, there is a corresponding subchain $V_{i_k,j_k}$, and $r(A_k) \leftarrow R(A_k, V_{i_k,j_k})$

     (2)  For any $X_k \in \{F\text{'s}\}$ in $\alpha$, there is a corresponding subchain $V_{i_k,j_k}$, and $r(X_k) \leftarrow R(X_k, V_{i_k,j_k})$

     (3)  $i_1 = 1$, $j_k = i_{k+1}$ for $1 \le k \le 2n+1$, and $j_{2n+2} = m+1$

(b)  An N-production rule

     $N_{r(N),D(N)} \to \beta = X_1 A_2 X_3 \cdots X_{2n+1}$,  $X_k \in \{F\text{'s}, N\text{'s}\}$,

     is selected corresponding to a vector subchain $V_{i,j}$ and

     $D(N) \leftarrow D(X_1) \oplus_{D(A_2)} D(X_3) \cdots \oplus_{D(A_{2n})} D(X_{2n+1})$
     $r(N) \leftarrow R(N^{\beta}, V_{i,j})$

     if and only if the following conditions are satisfied.

     (1)  For any $A_k$ in $\beta$, there is a corresponding subchain $V_{i_k,j_k}$, and $r(A_k) \leftarrow R(A_k, V_{i_k,j_k})$

     (2)  For any $X_k \in \{F\text{'s}\}$ in $\beta$, there is a corresponding subchain $V_{i_k,j_k}$, and $r(X_k) \leftarrow R(X_k, V_{i_k,j_k})$

     (3)  $i_1 = i$, $j_k = i_{k+1}$ for $1 \le k \le 2n$, and $j_{2n+1} = j$

In both (a) and (b), the conditions (1) and (2) state how the angle primitives and curve primitives are extracted from the noisy boundary chain, and condition (3) guarantees that the vector subchains corresponding to the A's, F's, and N's in the production rules selected are feasible or consistent. Therefore, if an S-production rule is selected, then (1) there is a derivation tree which derives a syntactically correct symbolic pattern in the language by using the production
rules,  (2)  the  derivation  tree  corresponds  to  a reference  segmentation 
of  a noise-free  shape,  and  (3)  there  exists  at  least  one  feasible  noisy 
boundary  segmentation  of  the  vector  chain  corresponding  to  the  reference 
segmentation.  (Refer  to  Section  3.6  for  the  meaning  of  reference  seg- 
mentation and  noisy  boundary  segmentation.) 

Since  each  recognition  function  outputs  a membership  value,  it  is 
easy to set a threshold to reduce the number of NBS's and hence to reduce the processing time. The NBS's denied by the thresholding will not
be  processed  any  further.  But  if  we  do  not  set  a threshold,  all  possi- 
ble NBS's  will  be  processed  even  if  some  of  them  correspond  to  very 
small  membership  values.  The  idea  and  implementation  of  membership 
thresholding  are  obvious,  so  they  are  not  discussed  in  detail  here. 
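Although the text leaves the implementation out, membership thresholding can be illustrated with a short sketch. It assumes, for illustration only, that each candidate NBS is represented by the list of membership values of its primitives and that a candidate is dropped as soon as one value falls below the threshold; the name `prune_segmentations` is hypothetical.

```python
from typing import List, Sequence


def prune_segmentations(
    candidates: Sequence[Sequence[float]],
    threshold: float,
) -> List[List[float]]:
    """Keep only the NBS's whose primitive memberships all reach the threshold.

    all() stops at the first membership below the threshold, so a denied
    NBS is not examined any further.
    """
    kept = []
    for memberships in candidates:
        if all(r >= threshold for r in memberships):
            kept.append(list(memberships))
    return kept
```

With a threshold of 0, every candidate survives, which corresponds to the case where no threshold is set.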

If  the  generalized  recognition  functions  are  used  in  our  PEE 
parser,  whenever  a S-production  is  used,  an  answer  consisting  of  c and  s 
will  be  obtained.  It  indicates  that  the  shape  is  recognized  by  the 
grammar labelled c with membership value s. If there is more than one answer, we can select the answer with the highest value of s, which supposedly corresponds to the feasible NBS that fits a derivation tree the best.

5.4  Error-Correcting  Techniques  and  Generalized  PEE  Parsing

In  the  previous  section,  the  parsing  continues  even  if  zero  mem- 
bership values  are  obtained.  That  is,  if  some  parts  of  the  shape  are  so 
badly  distorted  that  they  cannot  be  recognized  by  primitive  or  nontermi- 
nal recognition  functions,  the  parsing  will  still  continue,  but  with 
smaller  membership  values.  In  a sense,  this  idea  is  similar  to  the 
error-correcting technique discussed in [30,31,32]. Error-correcting parsing considers noisy symbolic patterns containing three types of errors, namely substitution, deletion, and insertion
errors.  We  will  discuss  these  error  types  together  with  the  special  na- 
ture of  shape  boundaries,  since  shape  is  our  major  interest  in  this 
study. 

If  some  parts  of  the  shape  are  so  distorted  that  they  are  misrecog- 
nized  as  wrong  primitives,  these  wrong  primitives  can  be  considered  as 
substitution  errors  by  error-correcting  techniques.  In  such  a case,  the 
error-correcting  technique  will  measure  the  distortion  by  the  number  of 
substitution  errors  and  the  generalized  PEE  parsing  will  measure  the 
distortion  by  membership  functions.  They  both  can  handle  this  kind  of 
distortion. 

In  fact,  the  two  techniques  attack  the  problem  of  noisy  patterns  by 
different  approaches.  The  generalized  PEE  (GPEE)  parsing  can  compute 
the  degree  of  distortion  for  each  primitive  or  nonterminal  via  the  mem- 
bership function.  Since  the  membership  functions  deal  with  the  boundary 
chain  and  the  descriptors  of  the  primitives  and  nonterminals  directly, 
it  is  easier  to  obtain  a membership  value  which  reflects  accurately  the 
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similarity  between  the  noisy  shape  and  the  shape  it  is  supposed  to  be. 
The number of errors counted by the error-correcting (EC) parsing is not accurate enough to reflect their similarity. Recently, Lu and Fu [87]
studied  the  stochastic  error-correcting  (SEC)  technique  for  recognizing 
noisy  symbolic  patterns.  Through  the  probabilities  assigned  to  the  pro- 
duction rules,  the  SEC  parser  can  compute  the  likelihood  function  of  a 
noisy  pattern.  This  was  an  important  advance  in  recognizing  noisy  sym- 
bolic patterns. 

Example  5.1

Figure 5.1 shows the top view of a simple airplane distorted by a cloud on the right wing. The true shape can be found in Figure 3.6. The cloud on the right side creates an extra area attached to the airplane. The boundary of this extra area is very noisy. Although some filtering techniques may be used to smooth out the zig-zags of the boundary, the shape of this extra area is unpredictable because it changes with the cloud, and the shape of a cloud is unpredictable. In such a case, the primitive extraction on the boundary of this extra area becomes a difficult problem. One way of solving this difficulty is to use an extra error primitive to represent the noisy curve segments which cannot be recognized as any legal primitive, and then to add the error production rules for this extra error primitive in all possible cases. But the length of this extra error primitive is still undefined, since the boundary length of the extra area of the cloud is unpredictable. A primitive without any specification of its attributes will cause confusion in extraction. For instance, the noisy boundary of that extra area may be considered as one or several consecutive noisy curve segments or error primitives. If the distorted boundary ($X_1X_2$) is extracted as one error primitive, the erroneous symbolic pattern can be accepted by an EC parser or SEC parser which allows only substitution errors. But there is no existing method which can systematically achieve such an extraction without using contextual information. If the distorted boundary is extracted as more than one error primitive, the erroneous symbolic pattern has to be accepted by an EC parser or SEC parser which allows substitution and insertion errors. If $X_1X_2$ is very long, as shown in Figure 5.1, due to distortion, it may be extracted as k error primitives. Then, the number of errors counted by an EC parser may change with k, and the probability computed by the SEC parser of a given grammar generating the erroneous symbolic pattern [...]dition of $X_1X_2$ should not change the probability or the error count which may affect the classification. Since a human can recognize Figure 5.1 by the unaltered portion of the shape, the similarity between Figures 3.6 and 5.1 should closely approximate the percentage of the unaltered portion of the shape. The GPEE parser can segment the shape in Figure 5.1 at the same positions as the true shape in Figure 3.6, but the distorted portion will have a small membership value, and hence the membership value of the whole distorted shape will also be smaller. If the distorted $X_1X_2$ has a zero membership, then the membership value of the whole shape will remain unchanged, no matter how badly $X_1X_2$ is distorted. In other words, the GPEE parser can fully use the contextual information to extract the primitives, so that a distorted shape like Figure 5.1 can be handled by only one substitution error. Besides, the associated attribute information can be used to obtain a membership value which is close to human perception.
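The contrast drawn in this example can be made concrete with a small numerical sketch. It is illustrative only: the lengths are invented, the whole-shape membership is assumed to be the length-weighted average of Section 5.2, and the error count of the EC parser is assumed to be one substitution plus k-1 insertions when the distorted portion is extracted as k error primitives. The function names are hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (curve length, membership value)


def whole_shape_membership(segments: List[Segment]) -> float:
    """Length-weighted average membership over all segments of the boundary."""
    total = sum(length for length, _ in segments)
    return sum(length * r for length, r in segments) / total


def split_distorted(length: float, k: int) -> List[Segment]:
    """Distorted portion extracted as k equal pieces, each with zero membership."""
    return [(length / k, 0.0)] * k


def ec_error_count(k: int) -> int:
    """Assumed EC count: one substitution plus (k - 1) insertions."""
    return 1 + (k - 1)


# Unaltered portion of length 6 recognized perfectly, distorted portion of length 2.
unaltered = [(6.0, 1.0)]
for k in (1, 2, 4):
    r = whole_shape_membership(unaltered + split_distorted(2.0, k))
    print(k, ec_error_count(k), r)
```

Under these assumptions the membership stays at the fraction of unaltered boundary (6 out of 8) for every k, while the error count grows with k.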


This example explains the advantages of GPEE parsing over EC or SEC parsing in extracting primitives and obtaining reasonable membership values.
But  the  error-correcting  parsing  is  more  powerful  than  the  generalized 
PEE  parsing  in  some  cases,  such  as  when  the  shape  is  distorted  in  such  a 
way  that  some  parts  disappear.  For  instance,  an  airplane  tail  is  miss- 
ing because  it  is  outside  the  visible  field.  The  boundary  is  shorter 
and  the  number  of  vectors  is  smaller.  There  may  not  be  a feasible  seg- 
mentation of  the  vector  chain  corresponding  to  any  possible  derivation 
tree.  In  such  a case,  the  generalized  PEE  parsing  will  stop,  but  the 
error-correcting  parsing  can  recognize  the  bad  shape  by  allowing  dele- 
tion errors.  In  other  words,  if  a whole  primitive  disappeared  from  the 
boundary,  the  GPEE  parser  could  not  handle  such  a deletion  error. 

However,  these  two  techniques  can  be  combined  together  to  obtain  a 
Generalized  Error-Correcting  PEE  (GECPEE,  for  short)  parsing  which  has 
all  of  the  merits.  As  mentioned  before,  a membership  threshold  can  be 
set  to  deny  recognition  of  some  primitives  and  nonterminals.  Without 
losing any generality, we set the threshold to $r_0$, i.e., all the recognitions with membership less than $r_0$ will be rejected. As discussed previously, the error-correcting technique can be used to take care of these rejections as well as the deletion errors.

The error-correcting technique with weighted errors has been recently studied by Fu and Lu [84]. Since we have the attributes associated with each primitive and nonterminal, we can select proper weights for the errors. For instance, we can use the fourth attribute, the proportional length, of a curve primitive as the error weight. If a primitive extraction is rejected by membership thresholding or a curve primitive is found deleted, the error-correcting technique is applied and the error weight indicates the length of the distorted or deleted portion in proportion to the whole shape.


Therefore, any arbitrary vector chain can be processed by a GECPEE parser. Given an arbitrary shape grammar, the parser will terminate with two values, a membership value and an error weight. The error weight estimates the portions of the shape which are too badly distorted to recognize. The membership reflects the similarity between the other recognized portions of the shape and the corresponding portions of a shape generated by the grammar. There may be many sets of membership values and error weights. The GECPEE parser can always select the one with high membership and low error weight according to criteria set by the user. The GECPEE parser attempts to find a derivation tree which may have flaws and a segmentation on the unknown vector chain which fits the tree the best. Figure 5.2 shows a possible derivation tree obtained by the GECPEE parser. The X's indicate the flaws of the derivation tree. Namely, X's indicate where the recognitions are rejected or the primitives are deleted.
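The two values returned by a GECPEE parse can be illustrated as follows. This is a sketch under stated assumptions, not the parser itself: each primitive of a candidate derivation is reduced to its proportional length (the fourth attribute) and a membership value, with `None` marking a deleted primitive; the membership of the recognized portions is assumed to be a length-weighted average; and the selection criterion (lowest error weight first, then highest membership) is only one possible user-defined criterion. The names `Part`, `gecpee_score`, and `select_answer` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Part:
    prop_length: float           # proportional length of the primitive
    membership: Optional[float]  # None when the primitive is deleted


def gecpee_score(parts: Sequence[Part], r0: float) -> Tuple[float, float]:
    """Return (membership, error_weight) for one candidate derivation.

    Primitives that are deleted or whose membership is below r0 add their
    proportional length to the error weight; the others contribute to the
    length-weighted membership of the recognized portions.
    """
    error_weight = 0.0
    weighted_sum = 0.0
    recognized_length = 0.0
    for p in parts:
        if p.membership is None or p.membership < r0:
            error_weight += p.prop_length
        else:
            weighted_sum += p.prop_length * p.membership
            recognized_length += p.prop_length
    membership = weighted_sum / recognized_length if recognized_length > 0 else 0.0
    return membership, error_weight


def select_answer(scores: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Assumed criterion: lowest error weight first, then highest membership."""
    return min(scores, key=lambda s: (s[1], -s[0]))
```

A user who prefers high membership over low error weight would simply change the key in `select_answer`.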


5.5  Modified  Versions  of  Earley's  Parser


Using Valiant's parser [91], the time needed to recognize a string of
length $n$ generated by a context-free grammar is at most a constant
times $n^{2.81}$. But the grammar has to be in Chomsky normal form
(CNF) and the constant is so large that Valiant's parser is of only
theoretical interest. (A CFG $G = (N,T,P,S)$ is in CNF if each
production in $P$ is of one of the forms (1) $A \to BC$, $A,B,C \in N$,
(2) $A \to a$, $a \in T$, or (3) if $\lambda \in L(G)$ then
$S \to \lambda$ is in $P$ and $S$ does not appear on the right side of
any production [43,91].) The Cocke-Younger-Kasami parser and Earley's
parser both result in a time bound proportional to $n^3$ [91]. The
Cocke-Younger-Kasami parser also requires the context-free grammar in
Chomsky normal form.
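The three CNF conditions quoted above can be checked mechanically. The
following Python sketch is illustrative only: the function name
`is_cnf`, the representation of productions as `(lhs, rhs)` pairs, and
the use of an empty tuple for $\lambda$ are assumptions made for this
example.

```python
def is_cnf(nonterminals, terminals, productions, start):
    """Check the three CNF forms for productions given as (lhs, rhs) pairs.

    rhs is a tuple of symbols; the empty tuple stands for lambda.
    """
    start_on_rhs = any(start in rhs for _, rhs in productions)
    for lhs, rhs in productions:
        if lhs not in nonterminals:
            return False
        # Form (1): A -> BC with B, C nonterminals.
        if len(rhs) == 2 and all(sym in nonterminals for sym in rhs):
            continue
        # Form (2): A -> a with a a terminal.
        if len(rhs) == 1 and rhs[0] in terminals:
            continue
        # Form (3): S -> lambda, allowed only if S never occurs on a right side.
        if len(rhs) == 0 and lhs == start and not start_on_rhs:
            continue
        return False
    return True
```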

Earley's parsing algorithm has been the most practical and efficient
algorithm for context-free grammars in general form. The modified
first half of this algorithm was the PEE Earley's parser for our shape
recognition in Section 3.6. Aho and Peterson introduced the
error-correcting technique to Earley's algorithm in 1972 [30]. In this
section, we will discuss several modified versions of Earley's algorithm
for shape recognition. We will describe the GPEE Earley's algorithm,
our version of the error-correcting Earley's algorithm, and the combined
GECPEE Earley's algorithm. The meaning of each item in the parse lists
of different versions will be explained as well. Again, only the part
of parsing table generation will be modified and discussed.

5.5.1  The  Generalized  PEE  Earley's  Algorithm 

The algorithm in Section 3.6 is modified to use the generalized
recognition function described in Section 5.2. In the following
algorithms, the subscript $r$ of a nonterminal or a primitive denotes
that the corresponding recognition function is performed and the
membership $r$ is obtained. The flow-chart of the generalized PEE
Earley's algorithm is shown in Figure 5.3. Each item in the parse
lists, $I$'s, has the same implication as that in Section 3.6.

Algorithm 5.1: The Generalized PEE Earley's Parser

Input: A shape grammar $G$, recognition functions $R$'s, and an unknown
chain of $m$ vectors, $V_{1,m+1}$.

Output: Membership $s = R(G, V_{1,m+1})$.

Method:

(1) Add $[S \to \cdot\,\alpha,\ 1]$ to $I_1$ for all $S \to \alpha$ in $P$
    $j = 1$

(2) (a) If $[N \to \alpha \cdot B\beta,\ i]$ is in $I_j$ and $B \to \gamma$ is in $P$,
        then add $[B \to \cdot\,\gamma,\ j]$ to $I_j$

    (b) If $[N_r \to \alpha\,\cdot,\ i]$ is in $I_j$,
        then for all $[B \to \beta \cdot N\gamma,\ k]$ in $I_i$
        add $[B \to \beta N_r \cdot \gamma,\ k]$ to $I_j$

(3) $j = j+1$
    if $j > m+1$ go to (5)
    For all $[N \to \alpha \cdot X\beta,\ i]$ in $I_k$, $1 \le k \le j$,
    $X \in \{F\text{'s}, A\text{'s}\}$

    (a) If $\beta \ne \lambda$, then $r = R(X, V_{k,j})$
        and add $[N \to \alpha X_r \cdot \beta,\ i]$ to $I_j$

    (b) If $\beta = \lambda$, then $r_1 = R(X, V_{k,j})$, $r_2 = R(N\alpha X, V_{i,j})$
        and add $[N_{r_2} \to \alpha X_{r_1}\,\cdot,\ i]$ to $I_j$

(4) Go to (2)

(5) Find $[S_s \to \alpha\,\cdot,\ 1]$ in $I_{m+1}$ with maximum $s$.

    If none can be found, then $s = 0$.

Figure 5.3 The Flow-Chart of Generalized PEE Earley's Algorithm

5.5.2 The Modified Error-Correcting Earley's Algorithm

In Section 5.4, we mentioned that the error-correcting technique
can be applied to recognize badly distorted shapes. Aho and Peterson
[30] added the error-correcting capability to Earley's algorithm by
adding the error productions to the production rule set, and modifying
the algorithm to track the error productions used. Fu and Lu [84]
further introduced the idea of a weighted error for each error
production. If the error weights of all the error production rules are
equal to 1, the total error weight will be the number of error
productions used. If some particular errors are impossible or not
allowed, the corresponding error productions will not be added to the
production rule set. Therefore, the advantage of using error
productions is that it controls the allowable errors, but the drawbacks
are the large storage and processing time required, especially when the
number of primitives is large.

If the deletion, substitution, or insertion errors are possible or
allowed to occur for all primitives at any place in the string, we do
not need to add the error productions to the production rule set in the
actual implementation. But we do need to modify Earley's algorithm
to take care of all three types of errors. In this section, we show the
feasibility of this idea for processing symbolic patterns. No PEE
mechanism is involved. In the following algorithms, we assume the error
weights for primitives are independent and additive. $W(\cdot)$ denotes
the error-weighting functions, with subscripts $D$, $I$, and $S$
indicating deletion, insertion, and substitution errors. $X$ denotes
the primitive, e.g., $W_S(X, b_j)$ means the error weight of $b_j$
substituting $X$. Step (2.c) in the following algorithm takes care of
the deletion errors, while step (3.b) takes care of the insertion and
substitution errors. The corresponding flow-chart is in Figure 5.4.

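An item $[A \to \alpha \cdot \beta,\ i,\ \omega]$ is in $I_j$ for input string
$b_1 \ldots b_n$, $0 \le i \le j \le n$, if and only if

(1) $\alpha \Rightarrow^{*} a_1 a_2 \ldots a_k$

(2) The difference between $a_1 \ldots a_k$ and $b_{i+1} \ldots b_j$ is
    $\{a_{d1}, \ldots, a_{dD},\ b_{i1}, \ldots, b_{iI},\ (a_{s1}, b_{s1}), \ldots, (a_{sS}, b_{sS})\}$, where
    $a_{d1} \ldots a_{dD}$ are the primitives deleted,
    $b_{i1} \ldots b_{iI}$ are the primitives inserted, and
    $(a_{s1}, b_{s1}), \ldots, (a_{sS}, b_{sS})$ indicate the substitution errors

(3) $\omega = \sum_{x=1}^{D} W_D(a_{dx}) + \sum_{x=1}^{I} W_I(b_{ix}) + \sum_{x=1}^{S} W_S(a_{sx}, b_{sx})$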

Algorithm 5.2: Modified Error-Correcting Earley's Parser

Input: An ordinary context-free grammar $G = (V,T,P,S)$, error-weighting
functions $W(\cdot)$'s, and an unknown input string, $b_1 \ldots b_n$.

Output: Error-weight, $\omega$.

Method:

(1) Add $[S \to \cdot\,\alpha,\ 0,\ 0]$ to $I_0$ for all $S \to \alpha$ in $P$
    $j = i = 0$

(2) (a) If $[A \to \alpha \cdot B\beta,\ i,\ \omega]$ is in $I_j$ and $B \to \gamma$ is in $P$,
        then add $[B \to \cdot\,\gamma,\ j,\ 0]$ to $I_j$

    (b) If $[A \to \alpha\,\cdot,\ i,\ \omega_1]$ is in $I_j$,
        then for all $[B \to \beta \cdot A\gamma,\ k,\ \omega_2]$ in $I_i$
        add $[B \to \beta A \cdot \gamma,\ k,\ \omega_1 + \omega_2]$ to $I_j$

    (c) If $[A \to \alpha \cdot X\beta,\ i,\ \omega]$ is in $I_j$
        and $X \ne b_{j+1}$, $j+1 \le n$,
        then add $[A \to \alpha X \cdot \beta,\ i,\ \omega + W_D(X)]$ to $I_j$

(3) $j = j+1$, if $j > n$ go to (5)
    For all $[A \to \alpha \cdot X\beta,\ i,\ \omega]$ in $I_{j-1}$

    (a) If $X = b_j$ then add $[A \to \alpha X \cdot \beta,\ i,\ \omega]$ to $I_j$

    (b) If $X \ne b_j$ then add $[A \to \alpha \cdot X\beta,\ i,\ \omega + W_I(b_j)]$
        and $[A \to \alpha X \cdot \beta,\ i,\ \omega + W_S(X, b_j)]$ to $I_j$

(4) Go to (2)

(5) Find $[S \to \alpha\,\cdot,\ 0,\ \omega]$ in $I_n$ with minimum $\omega$.

In the above algorithm, all the possible errors are considered. If
we know beforehand that a shape of total error weight higher than a
threshold will not be accepted, we can threshold the error when an item
is added to the item lists to eliminate a large number of possibilities.
For instance, if we do not accept a shape of error higher than
$\omega_0 = 0.4$, then we only need to consider items of form
$[A \to \alpha \cdot \beta,\ i,\ \omega]$, in which $\omega \le \omega_0$.
In such a case, step (5) may end up with no
$[S \to \alpha\,\cdot,\ 0,\ \omega]$ in $I_n$ for $\omega \le \omega_0$.
That means the input pattern is rejected by the grammar because no
derivation with error weight of at most $\omega_0$ can be found.

As mentioned in Section 5.4, we may use the proportional length of
a primitive to the whole shape as the error weight for each
curve primitive in our shape recognition. In this case, an item
$[S \to \alpha\,\cdot,\ 0,\ \omega]$ in $I_n$ implies that
$\omega \times 100\%$ of the shape is badly distorted.
The selection of threshold $\omega_0$ may affect the computational time
and recognition result. The smaller $\omega_0$ is, the shorter is the
computation time. But a shape with distortion greater than
$\omega_0 \times 100\%$ of what the shape is supposed to be will be
rejected.
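The idea of Algorithm 5.2, together with the thresholding by
$\omega_0$, can be sketched in Python for symbolic strings. This is a
simplified illustration under stated assumptions, not the implementation
described in this report: each item list is stored as a dictionary
keeping the smallest weight per item, deletion is allowed before any
expected primitive (without the test $X \ne b_{j+1}$), and an inserted
symbol carries every item forward rather than only those with a
primitive after the dot. The names `ec_earley`, `w_del`, `w_ins`,
`w_sub`, and `w_max` are hypothetical.

```python
def ec_earley(grammar, start, tokens, w_del, w_ins, w_sub, w_max=float("inf")):
    """Minimum total error weight of `tokens` against `grammar`, or None.

    grammar: dict mapping each nonterminal to a list of right-hand sides
    (tuples of symbols); any symbol that is not a key is a primitive.
    """
    n = len(tokens)
    # chart[j] plays the role of item list I_j: item -> smallest weight seen,
    # where an item is (lhs, rhs, dot, origin).
    chart = [dict() for _ in range(n + 1)]

    def add(j, item, weight, agenda):
        # Thresholding: items heavier than w_max are never stored.
        if weight <= w_max and weight < chart[j].get(item, float("inf")):
            chart[j][item] = weight
            agenda.append(item)

    def close(j, agenda):
        while agenda:
            item = agenda.pop()
            lhs, rhs, dot, origin = item
            w = chart[j][item]
            if dot == len(rhs):
                # Completer, step (2.b): advance items waiting for lhs.
                for (l2, r2, d2, o2), w2 in list(chart[origin].items()):
                    if d2 < len(r2) and r2[d2] == lhs:
                        add(j, (l2, r2, d2 + 1, o2), w + w2, agenda)
            elif rhs[dot] in grammar:
                # Predictor, step (2.a).
                nt = rhs[dot]
                for prod in grammar[nt]:
                    add(j, (nt, prod, 0, j), 0.0, agenda)
                # nt may already be complete over the empty span j..j
                # (all of its primitives deleted).
                for (l2, r2, d2, o2), w2 in list(chart[j].items()):
                    if l2 == nt and d2 == len(r2) and o2 == j:
                        add(j, (lhs, rhs, dot + 1, origin), w + w2, agenda)
            else:
                # Deletion, step (2.c): skip the expected primitive.
                add(j, (lhs, rhs, dot + 1, origin), w + w_del(rhs[dot]), agenda)

    agenda = []
    for prod in grammar[start]:
        add(0, (start, prod, 0, 0), 0.0, agenda)
    close(0, agenda)

    for j in range(1, n + 1):
        b = tokens[j - 1]
        agenda = []
        for item, w in chart[j - 1].items():
            lhs, rhs, dot, origin = item
            # Insertion: b is extra, the item is carried over unchanged.
            add(j, item, w + w_ins(b), agenda)
            if dot < len(rhs) and rhs[dot] not in grammar:
                x = rhs[dot]
                cost = 0.0 if x == b else w_sub(x, b)  # match or substitution
                add(j, (lhs, rhs, dot + 1, origin), w + cost, agenda)
        close(j, agenda)

    finals = [w for (lhs, rhs, dot, origin), w in chart[n].items()
              if lhs == start and dot == len(rhs) and origin == 0]
    return min(finals) if finals else None
```

With the toy grammar `{"S": [("a", "S", "b"), ("a", "b")]}` and unit
weights, a string in the language costs 0, and setting `w_max` below the
cheapest correction makes the function return `None`, which corresponds
to rejection by the threshold.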


5.5.3 GECPEE Earley's Algorithm

As mentioned in Section 5.5.1, the PEE Earley's parser has been
modified by using the generalized recognition functions. This
modification is for shape recognition only. In this section, we will
introduce the error-correcting technique to the generalized PEE
Earley's algorithm. Again, the GECPEE parser will be for shape
recognition only. Therefore, only the special input of vector chains
and the special nature of shape errors described in Section 5.4 are
considered.

Although  the  error-correcting  algorithm  in  Section  5.5.2  takes  care 
of  all  three  types  of  error,  deletion,  insertion,  and  substitution,  our 
GECPEE  Earley's  algorithm  will  consider  only  deletion  and  substitution 
errors  since  insertion  errors  will  not  occur,  if  the  boundary  vector 
chain  is  correctly  segmented  by  the  PEE  mechanism.  See  Example  5.1  in 
Section  5.4. 

In the following algorithm, step (2.c) takes care of deletion
errors and step (3.a) takes care of substitution errors. Steps (3.b)
and (3.d) take care of the successful primitive and nonterminal
recognitions. Step (3.c) implies that if the recognition of primitive
$X$ is successful, that of nonterminal $N$ is not successful and
$\omega = 0$, the recognition of $X$ will not be accepted. That is, if
all the successors of $N$ are successfully recognized, the recognition
of $N$ must succeed. The recognition of a primitive or nonterminal is
successful if and only if the membership value $r$ is greater than or
equal to the threshold $r_0$.


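An item $[N \to \alpha \cdot \beta,\ i,\ \omega]$ in $I_j$ for input vector chain
$V_{1,m+1}$ is interpreted as in Section 5.5.2, where the primitives
$a_{dx}$ are deleted and the pairs $(a_{sx}, V_{i_x,j_x})$ are
substitution errors.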


The flow-chart of this GECPEE Earley's algorithm is shown in Figure 5.5.


Algorithm 5.3: The GECPEE Earley's Algorithm

Input: A CFSG $G$, recognition functions $R$'s, membership threshold $r_0$,
error-weighting functions $W(\cdot)$'s, and an unknown chain of $m$
vectors $V_{1,m+1}$.

Output: Membership $s = R(G, V_{1,m+1})$, and error-weight $\omega$ = summation
of the weights of deletion and substitution errors.

Method:

(2) (c) If $[N \to \alpha \cdot X\beta,\ i,\ \omega]$ is in $I_j$, $X \in \{F\text{'s}, A\text{'s}\}$,
        then add $[N \to \alpha X_0 \cdot \beta,\ i,\ \omega + W_D(X)]$ to $I_j$

(3) $j = j+1$
    if $j > m+1$ go to (5)
    For all $[N \to \alpha \cdot X\beta,\ i,\ \omega]$ in $I_k$, $1 \le k \le j$,
    $X \in \{F\text{'s}, A\text{'s}\}$

    (a) If $\beta = \lambda$, $r = R(X, V_{k,j}) < r_0$
        then add $[N \to \alpha X_0\,\cdot,\ i,\ \omega + W_S(X, V_{k,j})]$ to $I_j$

    (b) If $\beta = \lambda$, $r_1 = R(X, V_{k,j}) \ge r_0$ and $r_2 = R(N\alpha X, V_{i,j}) \ge r_0$
        then add $[N_{r_2} \to \alpha X_{r_1}\,\cdot,\ i,\ \omega]$ to $I_j$

    (c) If $\beta = \lambda$, $\omega > 0$, $r_1 = R(X, V_{k,j}) \ge r_0$
        and $r_2 = R(N\alpha X, V_{i,j}) < r_0$
        then add $[N_{r_2} \to \alpha X_{r_1}\,\cdot,\ i,\ \omega]$ to $I_j$

    (d) If $\beta \ne \lambda$, and $r = R(X, V_{k,j}) \ge r_0$
        then add $[N \to \alpha X_r \cdot \beta,\ i,\ \omega]$ to $I_j$

    (e) If $\beta \ne \lambda$, and $r = R(X, V_{k,j}) < r_0$
        then add $[N \to \alpha X_0 \cdot \beta,\ i,\ \omega + W_S(X, V_{k,j})]$ to $I_j$

(4) Go to (2)

(5) Find $[S_s \to \alpha\,\cdot,\ 1,\ \omega]$ in $I_{m+1}$ with high $s$ and low $\omega$.


5.6 Experimental Results

Figure 5.5 The Flow-Chart of GECPEE Earley's Algorithm

In Section 5.5, several versions of Earley's algorithm were
developed and discussed. We have implemented the GPEE Earley's
algorithm and the GECPEE Earley's algorithm for our particular
interest, distorted shape recognition. Like other existing
error-correcting techniques, the above two algorithms both have the
disadvantages of large storage and long computational time. In GPEE
parsing we need generalized recognition functions for primitives and
nonterminals. The 0-1 recognition functions described in Assumptions
4.1 and 4.2 can be considered as rectangular pulse functions in a
4-dimensional transformed space. Intuitively, we may generalize the
step recognition functions by using trapezoidal functions or Gaussian
functions. Unfortunately, these various generalizations are intuitive
and lack theoretical support. Further study of distorted shapes,
gradual shape changes, and human perception of shape similarities may
help us construct reasonable generalized membership functions.
Therefore we used the 0-1 recognition functions in our implemented
GECPEE Earley's algorithm. Thus, we actually implemented an ECPEE
Earley's algorithm, which is a special case of the GECPEE algorithm.
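The three shapes of recognition function mentioned above can be
compared along a single axis of the transformed space. The Python
sketch below is only an illustration of the idea: the function names,
the `margin` and `sigma` parameters, and the restriction to one
dimension are assumptions, since no particular generalization is
adopted in the text.

```python
import math

def rectangular(x, lo, hi):
    """0-1 recognition: membership 1 inside [lo, hi], 0 outside."""
    return 1.0 if lo <= x <= hi else 0.0

def trapezoidal(x, lo, hi, margin):
    """Membership 1 inside [lo, hi], falling linearly to 0 over `margin`."""
    if lo <= x <= hi:
        return 1.0
    distance = lo - x if x < lo else x - hi
    return max(0.0, 1.0 - distance / margin)

def gaussian(x, center, sigma):
    """Membership 1 at `center`, decaying smoothly with distance."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)
```

The rectangular function is the special case used in the implemented
ECPEE parser; the other two return intermediate memberships for values
just outside the accepted interval.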


When a human recognizes a partially distorted shape, he or she relies
on the unaltered parts of the shape. The larger the distorted part is,
the more difficult the recognition will be; in other words, the less
confidence the recognition will have. If the recognition of a part of a
shape is unsuccessful, it does not matter how bad the distortion is.
Therefore, we defined the error-weighting function as follows.


    $W_D(X) = W_S(X, V_{i,j}) = \frac{L}{L_0}(X)$ for $X \in \{F\text{'s}\}$

$\frac{L}{L_0}(X)$ is one variable in the transformed descriptor of
curve primitive $X$. It represents the bad boundary length in
proportion to $L_0$, the total length of a good shape. Of course, $L$
and $L_0$ refer to the primitives defined in the grammar, so that they
are known. But the distortion of the unknown input pattern may change
the total boundary length considerably and unpredictably. The total
boundary length is required by the transformation $T$ to perform the
size normalization, as discussed in Chapter 3. In our experiment, we
assume that the total boundary length can be obtained by other means.
This assumption is reasonable because the distance from the object to
the camera can usually be estimated accurately. For instance, the
distance between a flying airplane and an observer can be estimated by
radar. Distances under water can be estimated by sonar. Once the
distance of a given object is known, the total boundary length of a
shape can be calculated using a proportional relation. Therefore, we
assume that the total boundary length is known and use it as a size
normalization factor.

Since ideally the angle primitive has no length, we defined
$W_D(A) = W_S(A, V_{i,j}) = 0$. But a curve primitive is recognized
only if one of its two adjacent angle primitives shown in the
production rule is recognized. So, the angle primitive recognition
itself has no error weight, but it is used as a condition for adjacent
curve primitives in our implementation. With these error-weighting
functions, the final error weight $\omega$ of
$[S_s \to \alpha\,\cdot,\ 1,\ \omega]$ in $I_{m+1}$ implies that
$\omega \times 100\%$ of the shape is badly distorted.
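The error-weighting functions defined above amount to summing the
proportional lengths of the flawed curve primitives. A minimal Python
sketch follows; the function names and the representation of a flawed
primitive as a `(kind, length)` pair are assumptions for illustration,
with `length` standing for $L$ and `total_length` for $L_0$ as defined
in the grammar.

```python
def error_weight(kind, length, total_length):
    """W_D = W_S = L / L_0 for a curve primitive, 0 for an angle primitive."""
    if kind == "angle":
        return 0.0
    if kind == "curve":
        return length / total_length
    raise ValueError("unknown primitive kind: %r" % (kind,))

def total_error_weight(flawed, total_length):
    """Sum the weights of deleted or substituted primitives (additive errors)."""
    return sum(error_weight(kind, length, total_length)
               for kind, length in flawed)
```

For example, two flawed curve primitives of lengths 10 and 5 on a shape
of total length 100 give a total error weight of about 0.15, that is,
about 15% of the shape is treated as badly distorted.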

The 0-1 membership functions we used for primitive and nonterminal
recognition are the same as those described in Section 4.5. (See
Assumptions 4.1 and 4.2.) The recognition functions are based upon
transformed descriptors only. Therefore, the membership value is always
1 if the recognition succeeds. As mentioned in Section 5.5.2, we set an
error threshold $\omega_0$ to reduce the number of items in parse lists
and hence reduce the computation time.

In fact, we use two other strategies in the implementation to
reduce the number of possibilities and computing time. One is deduced
from the error threshold $\omega_0$. Since $\omega_0$ implies that an
error percentage greater than $\omega_0 \times 100\%$ will not be
accepted, we may as well say the boundary recognized must be greater
than $(1-\omega_0) \times 100\%$. Therefore, when we are processing the
$i$-th vector of chain $V_{1,m+1}$, the boundary recognized in
$V_{1,i}$ plus the boundary to be processed in $V_{i,m+1}$ must be
greater than $(1-\omega_0) \times 100\%$ of the whole shape. The other
strategy uses the lookahead information to reduce the number of wrong
segmentations when processing the distorted portion of the shape. That
is, in Algorithm 5.3, step 3(e) says,

    "For all $[N \to \alpha \cdot X\beta,\ i,\ \omega]$ in $I_k$, $1 \le k \le j$,
    $X \in \{F\text{'s}, A\text{'s}\}$,
    if $\beta \ne \lambda$ and $r = R(X, V_{k,j}) < r_0$
    then add $[N \to \alpha X_0 \cdot \beta,\ i,\ \omega + W_S(X, V_{k,j})]$ to $I_j$."

We put in one more condition, i.e., if $\beta = Y\gamma$,
$Y \in \{F\text{'s}, A\text{'s}\}$, $r = R(X, V_{k,j}) < r_0$ and
$r_1 = R(Y, V_{j,t}) \ge r_0$ for some $t \ge j$, then add
$[N \to \alpha X_0 \cdot \beta,\ i,\ \omega + W_S(X, V_{k,j})]$ to $I_j$.
All these strategies make the following experiments possible under the
limitations of memory space and computing time.
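The first strategy is a simple bound that can be tested before an item
is extended. The Python sketch below illustrates it under assumptions:
the function name, the use of per-vector lengths, and the non-strict
comparison (so that an error of exactly $\omega_0$ is still admitted)
are choices made for this example only.

```python
def can_still_be_accepted(vector_lengths, i, recognized_length, w0):
    """Bound check before extending a parse at vector index i (0-based).

    vector_lengths: lengths of all vectors in the boundary chain.
    recognized_length: boundary length already recognized before index i.
    The parse is kept only if the recognized boundary plus everything
    still unprocessed can reach (1 - w0) of the whole boundary.
    """
    total = sum(vector_lengths)
    remaining = sum(vector_lengths[i:])
    return recognized_length + remaining >= (1.0 - w0) * total
```

With ten unit-length vectors and $\omega_0 = 0.15$, a parse that has
recognized only 4 of the first 6 vectors can reach at most 8 of the
required 8.5 units and is dropped.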

As a matter of fact, we implemented two versions of the ECPEE
Earley's parser. Version 1 allows only substitution errors, i.e., step
2(c) of Algorithm 5.3 is removed so that deletion errors are not
allowed. In this version, the latter of the above two strategies is not
implemented in the program. Version 2 allows both deletion and
substitution errors. Both strategies are implemented in version 2.

Experiment 5.1

Figure 5.6 is a picture of a B52 in a cloudy sky. It is obtained
by adding a picture of clouds to a picture of an arbitrary angle view of
the B52 shown in Figure 4.10(a). Because the cloud on the right side is
smoothly connected to and is as dark as the airplane, there is no way to
delineate the boundary between the cloud and the airplane without
first knowing the model and viewing angle of the airplane. Through the
procedures of threshold finding, boundary following, and smoothing we
obtained the shape shown in Figure 5.7. The vector chain in Figure 5.7
has 120 vectors.

Two versions of the ECPEE Earley's parser were implemented on Purdue
University's CDC 6500. Both versions can have no more than 8000 items
in the parsing table. Each item requires three 60-bit words. Thus, the
whole parsing table occupies 24K words, which is about $60K_8$ (octal)
words. The program itself occupies about $26K_8$ words. So, a total of
about $106K_8$ 60-bit words is required to run on the CDC 6500.
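The storage figures above mix decimal and octal counts, and the
arithmetic can be checked with a few lines of Python. The reading of
$K_8$ as octal 1000 (512 decimal) words is an assumption made for this
check; the variable names are illustrative only.

```python
ITEMS = 8000            # maximum number of items in the parsing table
WORDS_PER_ITEM = 3      # 60-bit words per item

table_words = ITEMS * WORDS_PER_ITEM      # 24000 decimal words
print(table_words, oct(table_words))      # 24000 is 56700 in octal

K8 = 0o1000                               # assumed meaning of "K" in octal
table_budget = 0o60 * K8                  # "about 60K" octal words
program_words = 0o26 * K8                 # "about 26K" octal words
total_words = table_budget + program_words
print(oct(total_words))                   # octal sum of the two figures
```

Under this reading, 24000 decimal words round up to 60000 octal, and
60000 plus 26000 octal is 106000 octal, which agrees with the total
quoted above.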

The chain of 120 vectors (see Figure 5.7) was fed into both
versions of the ECPEE Earley's parser with maximum error weight
$\omega_0 = 0.10$ and shape grammar $G_{B52,A}$ (see Experiment 4.1).
Version 2, which allowed both substitution and deletion errors, ran out
of parsing table space and could not get an answer. Version 1
recognized Figure 5.7 as B52,A with error weight 0.09 in 140 seconds.
The same vector chain was fed into version 1 with $\omega_0 = 0.10$ and
shape grammar $G_{F102,A}$ (see Experiment 4.1). The vector chain was
rejected by $G_{F102,A}$ in [...] seconds. Then, we fed the vector
chain to version 1 again with $\omega_0 = 0.15$ and shape grammar
$G_{B52,A}$. The parsing table ran out of space before an answer could
be obtained. Since the shapes of B52,A and F102,A are structurally
different, the shape in Figure 5.7 can be accepted by $G_{F102,A}$ only
if a small portion of the outer boundary is recognized as one or two
primitives of $G_{F102,A}$ and the rest of the shape is processed as
distorted portion. Very possibly, Figure 5.7 can never be accepted by
$G_{F102,A}$ because no primitive of $G_{F102,A}$ can be extracted from
Figure 5.7.

This experiment demonstrated the recognition capability of our
ECPEE parsing on a badly distorted shape such as that in Figure 5.7.
Also, this experiment revealed that the number of items in the parsing
table increased with $\omega_0$.

Experiment 5.2

This experiment demonstrates the recognition capability on another
type of distortion. When some parts of the shape are missing, we may
need to allow both deletion and substitution errors. In this
experiment, we will deal with some shapes of which some parts have
disappeared. Therefore we need to use version 2 of the ECPEE Earley's
parser.


Figures 5.8 and 5.10 are two F86,T's with a broken nose and a broken
right wing respectively. After the preprocessing, Figure 5.9, having
65 vectors, and Figure 5.11, having 72 vectors, are obtained. The
starting points of the shapes are the leftmost points in the figures.
The vector chains describing Figures 5.9 and 5.11 are fed into version
2 of the ECPEE Earley's parser with $\omega_0 = 0.15$ and shape
grammars $G_{F86,T}$ and $G_{MIG-15,U}$. (See Experiment 4.2.) Figures
5.9 and 5.11 are accepted by $G_{F86,T}$ with error weights 0.10 and
0.14 respectively and are both rejected by $G_{MIG-15,U}$. Processing
times for $G_{F86,T}$ accepting Figure 5.9 (having 65 vectors) and
Figure 5.11 (having 72 vectors) are 143 seconds and 117 seconds (CDC
6500) respectively. And the times for $G_{MIG-15,U}$ rejecting Figure
5.9 and Figure 5.11 are 52 seconds and 64 seconds respectively. We have
tried the experiment with different error thresholds $\omega_0$. The
results can be found in Table 5.1. The experimental results show that
the selection of $\omega_0$ affects the computation time, but does not
affect the recognition results at all. Of course, a very small
$\omega_0$ may reject some recognizable shapes and a very large
$\omega_0$ may take a long computing time and a large memory for the
parsing table to recognize or reject a shape.

Table 5.1 Experimental Results of Processing Distorted F86,T's
with ECPEE Earley's Parser against $G_{F86,T}$ and
$G_{MIG-15,U}$ with Different Error Thresholds $\omega_0$

Grammar         Threshold          Figure 5.9: F86,T with a      Figure 5.11: F86,T with a
                                   Broken Nose (65 vectors)      Broken Wing (72 vectors)
------------------------------------------------------------------------------------------
$G_{F86,T}$     $\omega_0$ = 0.15  $\omega$ = 0.10, 143 sec.     $\omega$ = 0.14, 117 sec.
$G_{F86,T}$     $\omega_0$ = 0.20  $\omega$ = 0.10, 375 sec.     $\omega$ = 0.14, 370 sec.
$G_{F86,T}$     $\omega_0$ = 0.25  Exceeds the Parsing Table     Exceeds the Parsing Table
                                   Limit                         Limit
$G_{MIG-15,U}$  $\omega_0$ = 0.15  Reject, 52 sec.               Reject, 64 sec.
$G_{MIG-15,U}$  $\omega_0$ = 0.20  Reject, 126 sec.              Reject, 165 sec.
$G_{MIG-15,U}$  $\omega_0$ = 0.25  Reject, 346 sec.              Reject, 503 sec.

Experiment 5.3

We mentioned in Experiment 5.1 that the processing time and the
number of items increase with $\omega_0$. Figure 5.12 shows the rear
half of a MIG-15,U. The preprocessed shape, having 40 vectors, can be
found in Figure 5.13. The starting point, the leftmost point in the
figure, is in the middle of the distorted region. It is processed by
version 2 of the parser with $\omega_0 = 0.5$. It is rejected by
$G_{F86,T}$ in 466 seconds, and accepted by $G_{MIG-15,U}$ with error
weight 0.48 in 267 seconds. We can see that the processing time goes up
with $\omega_0$ even though the number of vectors is less.

Figure 5.13 can still be recognized as MIG-15,U instead of F86,T
although the distorted portion of the shape is large, because the
important characteristics of the narrow width at the fuselage end and
the straight edge of the rear ends of the wings are still shown in the
shape for classification. Of course, the maximum error weight
$\omega_0$ has to be reasonably set, so that the badly distorted shape
can be recognized within the limitation of time and memory.


5.7 Discussion

As mentioned before, none of the other existing methods can
recognize shapes which are partially distorted. McKee and Aggarwal [76]
developed an automatic system to learn and recognize entire shapes and
partial shapes. By partial shapes, we mean only part of the shape is
visible. McKee and Aggarwal utilized a library to store the edges,
approximated and described by straight lines, of each shape and then
recognized the unknown shapes by matching the edges in the library.
This matching method can recognize the partial shape only if the set of
edges belonging to the object is known. As a matter of fact, this kind
of partial shape can be considered a special case of distorted shapes.

A very straightforward extension of our distorted shape recognition
involves identification of touching or overlapping objects. In
biomedical images, touching objects, such as blood cells, are very
difficult to separate. If two touching objects are in different
classes, the whole shape should be recognized by the two corresponding
grammars with two error weights respectively. If both error weights are
lower than a threshold, the touching assumption is correct. The same
argument is also true for more than two touching objects, all of which
are different. If two touching objects are in the same class, then we
need to do further processing to make sure there are two objects
touching each other. This further processing involves the second part
of parsing, parse construction from the parsing table.

In constructing the parse from the parsing table, we can find all the
possible segmentations of the vector chain. If there are two
segmentations $\gamma_1$ and $\gamma_2$,
$\gamma_1 = \{s_{11}, \ldots, s_{1n}\}$ and
$\gamma_2 = \{s_{21}, \ldots, s_{2n'}\}$, $\gamma_{j1}$ and
$\gamma_{j0}$ are sets of subchains $s_{ji}$ which are recognized and
rejected, respectively. So, $\gamma_j = \gamma_{j0} \cup \gamma_{j1}$
and $\gamma_{j0} \cap \gamma_{j1} = \emptyset$ (the empty set) for
$j = 1,2$. If there are no overlapping subchains between $\gamma_{11}$
and $\gamma_{21}$, $\gamma_{11}$ and $\gamma_{21}$ should be two
objects of the same class. If the two error weights corresponding to
$\gamma_1$ and $\gamma_2$ are both lower than the error threshold, the
whole vector chain should contain portions of the boundaries where the
two objects touch each other. This idea can be extended to more than
two touching objects.
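The test on two segmentations described above can be sketched in
Python. This is a hypothetical illustration: subchains are represented
as half-open index ranges `(start, end)` on the vector chain,
wrap-around on a closed boundary is ignored, and the function names are
not from the source.

```python
def overlap(a, b):
    """True if half-open index ranges a and b share at least one vector."""
    return a[0] < b[1] and b[0] < a[1]

def looks_like_two_objects(recognized_1, w1, recognized_2, w2, w_max):
    """Apply the two conditions stated in the text.

    recognized_1, recognized_2: recognized subchains of the two
    segmentations; w1, w2: their error weights; w_max: error threshold.
    """
    disjoint = not any(overlap(a, b)
                       for a in recognized_1 for b in recognized_2)
    return disjoint and w1 <= w_max and w2 <= w_max
```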

Another extension of distorted shape recognition is the
reconstruction of distorted parts. This topic will be discussed in
Chapter 7.

The GECPEE parsing is a time-consuming scheme because it has to
consider  many  possible  cases  and  finally  obtain  the  segmentation  that 
best  fits  a derivation  tree.  Two  thresholds,  membership  threshold  and 
error-weight  threshold,  can  be  employed  to  reduce  the  possibilities  and 
computational  time.  However,  this  method  can  solve  the  distorted  shape 
recognition  problem  in  addition  to  all  the  other  recognition  problems 
which  can  be  solved  by  other  existing  methods. 

In 1975, Persoon and Fu [88] proposed a sequential classification
algorithm (SCA) for stochastic context-free languages. The SCA is a
modified Earley's algorithm. In 1977, Lu and Fu [87] further studied
the SCA for noisy and distorted patterns. This algorithm is designed
for symbolic patterns. Of course, it is possible to use the sequential
classification idea to process a string of vectors which describe a
shape. Besides, setting a stopping rule and a decision rule based on
the membership values and error weights obtained in processing the
GECPEE parser is very easy. But, as mentioned before, a shape may have
different starting points due to the different scanning directions used
in finding the boundary. If the starting point happens to be in the
distorted portion, or the distorted portion is immediately adjacent to
the starting point, then the sequential classification may produce false
results. This situation can be improved by shifting the starting point
or scanning along different directions, and finding the longest
recognizable portion of the boundary chain. In addition, low-level
language programming, such as assembly language programming [97],
microprogramming [98], and parallel processing techniques [99,100] can
be used to speed up the processing.

CHAPTER 6

GRAMMATICAL INFERENCE

6.1 Introduction

It is shown in Chapter 4 that the recognition capability of our
proposed method is high when proper shape grammars are provided. Our
shape grammar can only describe 2-dimensional shapes, while actual
3-dimensional objects, e.g., airplanes, may have a number of different
2-dimensional views. Each view may require a different set of
production rules to describe it. We can manually construct a shape
grammar for some simple object which has only a few different views.
But for complicated objects, it is much more convenient to have an
algorithm which can automatically or interactively construct a grammar
from the shapes extracted from digitized pictures.

Grammatical  inference  is  an  interesting  research  topic.  Fu  and 
Booth  [78]  surveyed  various  techniques  of  grammatical  inference  from 
symbolic  sentences.  If  the  primitives  can  be  extracted  correctly  by 
machine  without  knowing  the  contextual  information  described  by  produc- 
tion rules,  all  of  the  patterns  can  be  represented  by  symbolic  sen- 
tences. The  inference  techniques  described  in  [78]  can  then  be  used  to 
obtain  the  grammar.  For  the  shape  recognition  problem,  we  would  like  to 
infer  grammars  as  well  as  primitives  with  attributes  from  the  boundary 


vector  chains,  since  the  production  rules  and  primitives  are  flexible 
and dependent on the particular shapes. With human intervention,
interactive inference can be done easily. In the following sections, we
describe an automatic and an interactive inference procedure, which are
basically designed for the shapes of a rigid body with a fixed
appearance at a certain viewing angle.

In  Section  6.2,  we  will  discuss  an  automatic  learning  algorithm, 
ALA,  which  infers  shape  grammars  directly  from  the  noisy  shapes  whose 
classifications  are  known.  The  ALA  is  actually  implemented  in  FORTRAN 
and  an  experiment  will  be  described  in  Section  6.3.  But  the  shape  gram- 
mars inferred  automatically  may  not  use  the  nonterminals  to  represent 
semantically  meaningful  and  discriminative  portions  of  the  shape  as 
properly  as  those  inferred  manually  and  interactively.  Therefore,  an 
interactive procedure is developed and discussed in Section 6.4. As
described in Chapter 4, using finite-state shape grammars (FSSG) is more
efficient in recognition time and is sometimes as accurate as using
context-free shape grammars (CFSG). Thus, the conversion from CFSG to
FSSG is sometimes worthwhile. This conversion will be discussed in
Section 6.5.


6.2  Automatic Inference of Shape Grammar

Since  the  shapes  obtained  from  digitized  images  usually  contain  a 
certain  amount  of  noise,  the  ALA  tries  to  infer  the  shape  grammars  such 
that  the  attributes  associated  with  the  primitives  and  nonterminals  are 
affected  by  the  noise  as  little  as  possible.  Based  on  the  random 
characteristic  of  the  noise,  the  ALA  accepts  more  than  one  boundary 
chain  for  one  view  and  uses  the  "averaging"  technique  to  reduce  the 
noise effect on the attributes. In the following paragraphs, the
techniques and assumptions employed in ALA will be explained.

The  ALA  provides  a systematic  way  of  decomposing  shapes.  That  is, 
the  training  shapes  are  decomposed  systematically  until  each  curve  seg- 
ment is  a curve  primitive.  To  do  so,  we  need  some  criteria  to  decide 
whether  a curve  is  decomposable.  If  it  is,  we  decompose  the  curve  into 
shorter  curves.  An  indecomposable  curve  is  designated  as  a NDC. 

Definition 6.1: A curve segment, $C = AB = v_1 v_2 \ldots v_n$, is not
decomposable if the distance from any point on the curve to $\overline{AB}$ is
less than $E_h$, or if the angle change at any point along the curve is
less than $E_a$. $E_h$ and $E_a$ are arbitrary small positive numbers.

If $E_h = 0$ and $E_a = 0$, the NDC is a straight line. If $E_a = 0$ and $E_h$
is a small number, then the NDC can be approximated by $\overline{AB}$. (See Figure
6.1(a)). If $E_h = 0$ and $E_a$ is a small number, then the NDC can be
approximated by a very smooth curve from A to B. (See Figure 6.1(b)).

Definition 6.2: A curve segment which does not meet the requirements of
Definition 6.1 is decomposable.
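
The test in Definitions 6.1 and 6.2 can be sketched as follows. This is only an illustrative sketch, not the FORTRAN implementation: it assumes the curve is given as a list of (x, y) points instead of a boundary vector chain, measures distance to the straight line through A and B, and uses the hypothetical names `chord_distance`, `angle_change` and `is_ndc`, with `e_h` and `e_a` standing for $E_h$ and $E_a$.

```python
import math

def chord_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    length = math.hypot(bx - ax, by - ay)
    if length == 0.0:
        # Degenerate chord (A == B): fall back to the distance from A.
        return math.hypot(px - ax, py - ay)
    return abs((bx - ax) * (py - ay) - (by - ay) * (px - ax)) / length

def angle_change(prev, cur, nxt):
    """Absolute turning angle (radians) at cur, between prev->cur and cur->nxt."""
    a1 = math.atan2(cur[1] - prev[1], cur[0] - prev[0])
    a2 = math.atan2(nxt[1] - cur[1], nxt[0] - cur[0])
    d = abs(a2 - a1) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def is_ndc(points, e_h, e_a):
    """True if the curve is not decomposable in the sense of Definition 6.1."""
    a, b = points[0], points[-1]
    interior = range(1, len(points) - 1)
    # Condition 1: every point lies within e_h of the chord AB.
    flat = all(chord_distance(points[i], a, b) < e_h for i in interior)
    # Condition 2: the angle change at every point is below e_a.
    smooth = all(
        angle_change(points[i - 1], points[i], points[i + 1]) < e_a
        for i in interior
    )
    return flat or smooth
```

Under these assumptions a straight run of points is an NDC, while a right-angle corner fails both conditions and is therefore decomposable.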

There  are  four  rules,  according  to  which  a decomposable  curve  is 
broken  into  shorter  segments. 

Decomposition Rules:

Rule 1: A decomposable curve can always be decomposed at a point which
is the farthest away from $\overline{AB}$ among the points whose angular
changes are greater than $E_a$.

Rule 2: For each decomposition, an associated production rule is
constructed and added to the production rule set.

Rule 3: The decomposable curves and NDC's obtained are defined as
nonterminals and curve primitives, respectively.

Rule 4: Any connection between two curve segments is defined as an
angle primitive.
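
Rule 1 can be illustrated by a short recursive sketch. It rests on the same assumptions as before (a curve given as a list of (x, y) points, distance measured to the line through the end points), the function names are hypothetical, and the bookkeeping of Rules 2 to 4 (production rules, symbols, angle primitives) is left out.

```python
import math

def chord_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    length = math.hypot(bx - ax, by - ay)
    if length == 0.0:
        return math.hypot(px - ax, py - ay)
    return abs((bx - ax) * (py - ay) - (by - ay) * (px - ax)) / length

def angle_change(prev, cur, nxt):
    """Absolute turning angle (radians) at cur."""
    a1 = math.atan2(cur[1] - prev[1], cur[0] - prev[0])
    a2 = math.atan2(nxt[1] - cur[1], nxt[0] - cur[0])
    d = abs(a2 - a1) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

def split_index(points, e_a):
    """Rule 1: among interior points whose angle change exceeds e_a,
    pick the one farthest from the chord AB. None if there is no such point."""
    a, b = points[0], points[-1]
    candidates = [
        i for i in range(1, len(points) - 1)
        if angle_change(points[i - 1], points[i], points[i + 1]) > e_a
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda i: chord_distance(points[i], a, b))

def decompose(points, e_h, e_a):
    """Split a curve recursively until every piece is treated as an NDC."""
    a, b = points[0], points[-1]
    flat = all(chord_distance(p, a, b) < e_h for p in points[1:-1])
    k = split_index(points, e_a)
    if flat or k is None:
        return [points]          # not decomposable: a curve primitive
    # The split point is shared by both halves (it becomes the connection).
    return decompose(points[:k + 1], e_h, e_a) + decompose(points[k:], e_h, e_a)

corner = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
pieces = decompose(corner, e_h=0.1, e_a=0.1)
```

For the right-angle example the curve is split once, at the corner point (2, 0), into two straight pieces.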

Because of noise, we try to match more than one shape, so that the
averaging technique can be used to reduce the noise effect on the
descriptors. Therefore, the ALA takes as input more than one shape
boundary for the same view. If the input has only one shape boundary,
it is implicitly assumed to be noise-free.

The  boundaries  may  start  from  different  points.  A MATCH  routine  is 
used  to  search,  in  one  boundary  chain,  for  a point  which  corresponds  to 
the starting point of the other boundary chain. The problem can be
formulated as follows. The notation $\equiv$ was defined in Definition 3.10.

Problem 6.1: Match Problem

$V = v_1 v_2 \ldots v_n$ and $U = u_1 \ldots u_m$ are two boundary vector chains. We
must find $i, i', n', j, j', m'$ with $1 \le i < i' \le n' < n+1$ and
$1 \le j < j' \le m' < m+1$, such that the curves [...] must be less than a
certain small positive number which is the corner tolerance.

Since the above curves may not be primitives, and the production
rules are not constructed yet, we cannot use Assumption 4.1 directly for
all the curve segments. A modified assumption is made for this case.

Assumption 6.1: [...] such that we can decompose V into $V_1$ [...] or
(2) V is decomposable [...]

The R functions can be implemented as in Assumption 4.1 for curve
segments and as in Assumption 4.2 for angle primitives. If V and U, or
$V_1$ and $V_2$, are NDC's or simple curves, Assumption 6.1 performs a
reasonably good recognition. If $V_1$ and $V_2$ are rather complex, they may be
incorrectly recognized. This may lead to incorrect matching. To avoid
incorrect matching, we can alter the thresholds by decreasing the k's.
With stricter thresholds, some noisy boundaries may not be matched. The
shapes which are not matched will be treated separately, because they
are considered so noisy that they need different primitives and
production rules to describe them. This situation simply increases the number
of production rules and primitives in the grammar.

There  are  many  techniques  that  can  be  used  to  solve  the  match  prob- 
lem. We  implement  the  MATCH  routine  by  using  "DO"  loops  in  FORTRAN  and 
using  Assumption  6.1  to  search  for  the  match  for  every  pair  of  training 
patterns.  The  ALA  is  described  as  follows,  and  the  corresponding  flow- 
chart is  shown  in  Figure  6.2. 

Algorithm 6.1: The ALA

Input: n boundary chains as learning sample patterns.

Output: A context-free shape grammar consisting of production rules and
descriptors.

Method:

(1) The MATCH routine is used to match (n-1) pairs of chains and all
    the corresponding points are memorized.

(2) Each boundary chain is decomposed with respect to the possible
    starting points found in (1). An associated S-production rule is
    constructed. The corresponding curve segments in different chains
    are defined using the same nonterminal or primitive symbols. (The
    associated descriptor is obtained by averaging all the descriptors
    corresponding to the same symbol at step (5)).

(3) Decompose each curve segment obtained in (2) and (3), if it is
    decomposable, according to the Decomposition Rules.

(4) If no decomposable curve can be found, then goto (5); otherwise goto
    (3).

(5) Calculate the attributes for all descriptors. Terminate.

Figure 6.2 The Flow-Chart of ALA

As discussed in Section 3.4, in order to recognize the boundary
chain  starting  from  different  convex  points,  the  S-production  rules  are 
constructed  with  respect  to  different  probable  starting  points.  But  in 
some  cases,  the  boundary  has  smooth  convex  portions.  In  other  words, 
any  point  on  the  convex  curve  boundary  can  be  the  starting  point.  In 
such  cases,  it  may  take  much  space  to  store  and  much  time  to  process  all 
the  S-production  rules  corresponding  to  all  the  probable  starting 
points.  Actually,  we  only  use  S-production  rules  for  a finite  number  of 
most  probable  starting  points. 

The ALA assumes that the most probable starting points appear in the
training sample set. Therefore, the more training samples are input to
the ALA, the more complete the output grammar will be. But the number
of matching pairs, n(n-1)/2, increases rapidly with the number of
training samples n. Thus, we may divide the training samples into a few
sets. Each set of training patterns would be processed by the ALA
separately, and the results then combined to obtain a complete shape
grammar. The ALA also assumes that the number of training samples in
one set is large enough that the random noise on the descriptors is
cancelled out by averaging. Besides, the corresponding portions of
different shapes in one training set will be designated with the same
nonterminals or primitives. Therefore, the more training samples we
have in one set, the more accurate and compact the grammar output from
the ALA will be. Let us see an example.

Example 6.1:

We would like to infer a shape grammar for a simple shape "L" shown
in Figure 6.3. With respect to different rotation angles, we obtain 3
noisy shapes shown in Figure 6.4(a)-(c). The starting points are at the
bottom of the shapes. These three shapes, $V = v_1 \ldots v_7$, $U = u_1 \ldots u_6$, and
$W = w_1 \ldots w_8$, serve as input to the ALA in one training set. The ALA has
to match 3 pairs of shapes, namely (V,U), (U,W), and (V,W), in step (1).
The MATCH routine finds out that:

(a) V and U can be matched as follows:

    curve segments     $v_1 \ldots v_3 \equiv u_4 \ldots u_6$
                       $v_4 \ldots v_7 \equiv u_1 \ldots u_3$

    angle primitives   $v_7 v_1 \equiv u_3 u_4$
                       $u_6 u_1 \equiv v_3 v_4$

(b) U and W can be matched as follows:

    curve segments     $u_1 \ldots u_4 \equiv w_4 \ldots w_8$
                       $u_5 u_6 \equiv w_1 \ldots w_3$

    angle primitives   $u_6 u_1 \equiv w_3 w_4$
                       $u_4 u_5 \equiv w_8 w_1$

(c) V and W can be matched as follows:

    curve segments     $v_1 v_2 \equiv w_7 w_8$
                       $v_3 \ldots v_7 \equiv w_1 \ldots w_5$

    angle primitives   $v_2 v_3 \equiv w_8 w_1$
                       $v_7 v_1 \equiv w_5 \ldots w_7$

In step (2), each boundary is decomposed with respect to the starting
points found above. So we have 3 S-production rules as follows:

    SL → F1 A1 F2 A2 N1 A3
    SL → N1 A3 F1 A1 F2 A2
    SL → F2 A2 N1 A3 F1 A1

where F's and N's are curve segments and A's are angle primitives.

    F1 indicates $v_1 v_2$ in V, $u_4$ in U and $w_7 w_8$ in W.
    F2 indicates $v_3$ in V, $u_5 u_6$ in U and $w_1 \ldots w_3$ in W.
    N1 indicates $v_4 \ldots v_7$ in V, $u_1 \ldots u_3$ in U and $w_4 w_5$ in W.
    A1 indicates $v_2 v_3$ in V, $u_4 u_5$ in U and $w_8 w_1$ in W.
    A2 indicates $v_3 v_4$ in V, $u_6 u_1$ in U and $w_3 w_4$ in W.
    A3 indicates $v_7 v_1$ in V, $u_3 u_4$ in U and $w_5 \ldots w_7$ in W.
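
The three S-production rules of this example are cyclic rotations of one closed-boundary symbol string, one rotation per starting point. A minimal sketch of that relationship follows; the function name `s_productions` and the list representation are ours, not part of the ALA implementation.

```python
def s_productions(symbols, start_symbols):
    """Return one cyclic rotation of the closed-boundary symbol string
    for each symbol at which a boundary chain may start."""
    rules = []
    for start in start_symbols:
        k = symbols.index(start)
        rules.append(symbols[k:] + symbols[:k])
    return rules

# Closed boundary of the "L" example, and the curve segments at which
# the three training chains V, U and W start.
boundary = ["F1", "A1", "F2", "A2", "N1", "A3"]
rules = s_productions(boundary, ["F1", "N1", "F2"])
for rhs in rules:
    print("SL ->", " ".join(rhs))
```

Each additional probable starting point adds one more rotation, which is why only a finite number of the most probable starting points is kept.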

In step (3), the N1 in each boundary chain is further decomposed
according to the Decomposition Rules.

$v_4 \ldots v_7$ is first decomposed into $v_4 v_5$ and $v_6 v_7$. A corresponding
production rule is obtained:

    N1 → F3 A4 F4

where F3 indicates $v_4 v_5$, F4 indicates $v_6 v_7$ and A4 indicates $v_5 v_6$.
$u_1 \ldots u_3$ is then decomposed into $u_1$ and $u_2 u_3$. Since their descriptors
are very close to those of F3 and F4, the same N1 production is used and
F3 and F4 also indicate $u_1$ and $u_2 u_3$ respectively. The same situation
occurs when $w_4 w_5$ is decomposed. Therefore,

    F3 indicates $v_4 v_5$ in V, $u_1$ in U and $w_4$ in W.
    F4 indicates $v_6 v_7$ in V, $u_2 u_3$ in U and $w_5$ in W.
    A4 indicates $v_5 v_6$ in V, $u_1 u_2$ in U and $w_4 w_5$ in W.


In step (5), all the attributes of the descriptors associated with the
primitives and nonterminals can be computed by averaging the descriptors
corresponding to the same symbol. Therefore, a shape grammar of L is
obtained as follows:

    GL = (V, T, P, SL)

    V = {SL, N1}

    T = {F1, F2, F3, F4, A1, A2, A3, A4}

    P:  SL → F1 A1 F2 A2 N1 A3
        SL → N1 A3 F1 A1 F2 A2
        SL → F2 A2 N1 A3 F1 A1
        N1 → F3 A4 F4
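
The averaging of step (5) can be sketched as below. The sketch assumes each descriptor is a plain tuple of numeric attributes and takes a component-wise mean; the actual attributes and any special handling (for example of angle-valued attributes that wrap around) are not reproduced here, and the name `average_descriptors` and the sample values are hypothetical.

```python
from collections import defaultdict

def average_descriptors(samples):
    """samples: iterable of (symbol, attribute_tuple) pairs gathered from
    all training chains. Returns {symbol: tuple of per-attribute means}."""
    grouped = defaultdict(list)
    for symbol, attrs in samples:
        grouped[symbol].append(attrs)
    return {
        symbol: tuple(sum(column) / len(column) for column in zip(*rows))
        for symbol, rows in grouped.items()
    }

# Illustrative values only: two noisy measurements of F1, one of A1.
measured = [("F1", (1.0, 2.0)), ("F1", (3.0, 4.0)), ("A1", (90.0,))]
averaged = average_descriptors(measured)
```

With more training chains per set, more measurements fall under each symbol, which is the sense in which averaging reduces the noise effect on the descriptors.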


6.3  Experimental  Results  and  Discussion 


Experiment  6.1: 


To  demonstrate  the  proposed  ALA,  two  views  of  F102,A  and  B52,A  are 
selected.  The  F102,A  has  4 training  patterns  in  one  training  set.  The 
B52,A  has  6 training  patterns  in  two  training  sets,  3 in  each  set.  All 
ten training patterns can be found in group L in Appendix B. The reason
for separating the 6 training patterns of B52,A into 2 sets is to reduce
the number of matching pairs from 15 to 6. The grammars resulting from
different training set arrangements may differ in compactness.
The more patterns there are in one set, and the fewer training sets there
are, the more compact the grammar will be.
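
The pair counts quoted here follow from the n(n-1)/2 formula applied to each training set separately. A small check, with a hypothetical helper name:

```python
def matching_pairs(set_sizes):
    """Total number of pairwise matches when each training set of size n
    is matched internally, i.e. the sum of n(n-1)/2 over the sets."""
    return sum(n * (n - 1) // 2 for n in set_sizes)

one_set = matching_pairs([6])      # all six B52,A patterns in one set
two_sets = matching_pairs([3, 3])  # the same patterns split into two sets
f102_set = matching_pairs([4])     # the four F102,A patterns
```

Splitting six patterns into two sets of three reduces the count from 15 to 6, at the price of a possibly less compact grammar.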

To  avoid  incorrect  matching  by  the  MATCH  routine,  we  used  rather 
strict  thresholds  for  the  recognition  function.  The  corresponding 
recognition  region  in  the  transformed  attribute  space  is  about  the  same 
size  as  the  distribution  of  the  descriptors  studied  in  Section  3.5.  We 
set  corner  tolerance  kc  = 6 and  thresholds  = 0.15,  k^  = 0.1, 

k^  = 0.2,  and  k^  =>  0.1  (see  Section  4.4).  These  thresholds  caused  two 
pairs of B52,A training shapes to not match. The output grammars
$GL_{B52,A}$ and $GL_{F102,A}$ are shown in the following pages. The
corresponding segmentation will not be shown here, because it takes a
lot of programming effort to add the plotting facilities to ALA to track
each decomposition and each primitive assignment. We used $GL_{B52,A}$ and
$GL_{F102,A}$ to recognize the 10 training samples shown in group L and
another 10 testing samples shown in group T. (See Appendix B.) All 40
tests, 20 patterns for each grammar, are correct. The recognition
functions we used in the recognition experiment are the same as those we
used in Experiment 4.1. Also, the recognition time is about the same as
that in Experiment 4.1.

The ALA was implemented in FORTRAN on Purdue University's CDC 6500
computer. The processing time was 3.9 seconds for inferring $GL_{B52,A}$ and
3.4 seconds for inferring $GL_{F102,A}$.

$GL_{F102,A} = (V_{F102,A}, T_{F102,A}, P_{F102,A}, S_{F102,A})$

and

$V_{F102,A} = \{S_{F102,A}, N_I\text{'s}\}$, $I = 1, 2, \ldots, 3$

$T_{F102,A} = \{F_J\text{'s}, A_K\text{'s}\}$, $J = 1, \ldots, 14$, $K = 1, \ldots, 13$

The design of ALA in Section 6.2 is based upon the training patterns
of one view. The shape grammar is constructed without knowing other
classes of patterns. If the classes of patterns to be discriminated are
significantly different in structure, the ALA may infer grammars which
are effective in discrimination. If the different classes of patterns
are only slightly different in detail or in certain parts, the grammars
inferred by ALA may not be discriminative enough because the ALA does
not know the differences between the classes. For example, MIG-15,U and
MIG-15,V (see Figure 4.13) differ in the width at the end of the
fuselage. The ALA may infer $GL_{\text{MIG-15},V}$ based on the MIG-15,V shapes,
and $GL_{\text{MIG-15},U}$ based on the MIG-15,U shapes. The ALA does not know
where the significant difference should be. Therefore, its output
grammars may not be effective in discriminating MIG-15,U's and
MIG-15,V's. Of course, the ALA can be improved. But the present ALA
implementation requires 1200 statements in FORTRAN. Any further
improvement will increase the complexity and length of the program.

To  obtain  discriminative  and  compact  grammars  for  shape  classifica- 
tion without  too  much  human  effort,  we  developed  an  Interactive  Learning 
Algorithm,  ILA. 

6.4  The Interactive Inference of Shape Grammars

Manual  inference  of  shape  grammars  requires  more  human  effort  than 
automatic  inference,  while  the  grammars  obtained  manually  are  more 
discriminative  than  those  obtained  automatically.  A man-machine  in- 
teractive procedure  may  improve  performance  yet  reduce  human  effort. 

There  are  many  different  versions  of  interactive  procedures  due  to 
the  differing  extents  of  interaction.  In  this  section  we  will  discuss 
one developed procedure which is implemented in FORTRAN at Purdue's
Pattern Processing and Advanced Automation Laboratory.

The interactive learning algorithm, ILA, processes one model pattern
at a time. The algorithm is first fed a shape pattern which may be
obtained through the preprocessing described in Chapter 4. The original
source of this pattern may be a noisy analog picture. The human
operator is allowed to modify this model pattern to reduce the noise. Then
the human operator may use ALA to infer a grammar or interactively
assign the curve primitives and production rules. We used the interactive
way in our experiments. The ILA will automatically check the
consistencies or conflicts of the primitive assignments. If there are any
conflicts or inconsistencies, these wrong primitive assignments need to be
corrected interactively. The ILA will also assign the angle primitive
and compute all the descriptors automatically. The grammar inferred
interactively can come close to a manually designed grammar. If the
classification performance is unsatisfactory, the grammar can always be
modified again through a feed-back loop.

Algorithm 6.2: ILA

Input: Training patterns and operator commands from the keyboard.

Output: A CFSG

For  simplicity,  the  details  of  this  algorithm  are  omitted  here,  but 
the  block  diagram  of  this  ILA  is  shown  in  Figure  6.5.  The  (I)'s  mark 
the  blocks  in  which  the  interactive  techniques  are  employed.  The  RAMTEK 
display  is  utilized  to  show  the  primitive  assignments,  pattern  modifica- 
tion, etc. 


[Flow-chart blocks: Input Grammar; (I) Grammar Creation or Modification;
Input Pattern; (I) Pattern Modification; (I) Input or Modify Curve
Assignment; (I) Input or Modify Production Rules; Check for Conflicts,
Assign Angle Primitives, Compute Descriptors; Output Conflicts; (I) Next
Step: test, another pattern, or modification; Output Grammar;
(I) Performance Test (satisfactory / unsatisfactory)]

Figure 6.5 The Flow-Chart of ILA

The context-free shape grammars used in Chapter 4 are produced by
the ILA discussed here. The experiments in Chapter 4 have already
demonstrated the effectiveness of ILA.

Since ILA requires human effort and feedback from the performance
test, it is difficult to make a good comparison between ILA and ALA.
Besides, the ALA was implemented on a CDC 6500 computer while the ILA
was implemented on a PDP 11/45 with a RAMTEK display. However, we still
can get some idea from our experiences using ILA and ALA. As we
mentioned in Section 6.3, the computing time for inferring $GL_{B52,A}$ or
$GL_{F102,A}$ was about 3.7 seconds on the CDC 6500. Of course, the
computing time will increase with the number of training samples and with the
number of vectors in each training sample. When we used ILA to infer
the grammars on the PDP 11/45, about 2 hours was needed for each
grammar. These 2 hours include human observation, design and
man-machine interaction. Of course, the central processing unit of the
computer was idle and waiting for input from the keyboard during most of
this time. Needless to say, the inferring time of ILA is related to
human skill and to the smoothness of the input model patterns.

6.5  The CFSG to FSSG Conversion

The  attributed  grammars  are  used  for  shape  description  and  recogni- 
tion because  (1)  the  production  rules  can  describe  the  structure,  (2) 
the  primitives  describe  the  boundary  details  with  attributes,  and  (3) 
the  nonterminals  can  describe  portions  of  the  shape  with  attributes.  If 
the  primitives  of  a shape  pattern  are  defined,  the  boundary  details  are 
described  by  the  attributes  and  the  production  rules  describe  the  struc- 
ture in  either  context-free  form  or  finite-state  form.  The  nonterminals 
in CFSG can properly be used to represent some semantically meaningful
portion of the shape and the associated attributes can usefully
represent the semantic information for discrimination. But the
nonterminal in FSSG represents the remaining part of the shape after some
consecutive primitives are recognized. There is no flexibility of using
nonterminals to represent some perceptually meaningful portions of the
shape. Experiment 4.2 shows the discriminative capability of
nonterminals. Therefore, if certain portions of the shape need to be described
by attributes or represented by nonterminals, the CFSG is preferred over
the FSSG.

In some cases, the differences in structure and boundary details
are sufficient for discriminating different classes. In Experiment
4.1, for example, both CFSG and FSSG can be used for recognition. Since
the finite automaton is usually faster in recognizing shapes (see
Section 4.5), the inference of FSSG is worth studying. Because we
already have algorithms for inferring CFSG, we would rather study the
CFSG to FSSG conversion than a separate FSSG inference algorithm.
Besides, the converted FSSG has the same primitive set as the CFSG. This
makes comparison of the performance between CFSG and FSSG much more
convenient.

According to the theory of formal languages, for an arbitrary CFG
G, it is undecidable whether L(G) is regular [43]. In other words,
there is no general way to decide whether there exists an FSG F such that
L(F) = L(G), where L(F) is the language generated by F. But we may be
able to find F's for a subset of CFG G's because of the following lemma
in [43].
Lemma 6.1: [43]

L(G) is not regular if and only if all grammars that generate L(G) are
self-embedding, where "self-embedding" is defined as follows:

Definition 6.3: [43]

A CFG G = (N,T,P,S) is self-embedding if $A \Rightarrow^{*} uAv$ for some u and v
in $T^{+}$ (neither u nor v can be empty).

Lemma 6.1 implies the following lemma.

Lemma 6.2:

If a CFG G is not self-embedding, then L(G) is regular.

To further study the conversion we need to know the Greibach normal
form of CFG.

Definition 6.4: [43]

A CFG G is said to be in Greibach normal form (GNF) if G is $\lambda$-free
and each production is of the form $A \to a\alpha$ with $a \in T$, $\alpha \in N^{*}$.
$\lambda$ indicates empty and $\lambda$-free means G has no empty productions.

Theorem 6.1: [43]

If L is a CFL, then L = L(G) for some G in GNF.

The proof of this theorem can be found in [43]. There are several
algorithms in [43] which convert an arbitrary CFG into its GNF. We are
interested in determining whether or not a CFG in GNF is self-embedding.
If not, we can try to construct an FSG F such that L(F) = L(G).
Otherwise, the existence of such an F is unknown. We have developed Algorithm
6.3 for detecting the self-embedding property of a general CFG in GNF.


For a better understanding of Algorithm 6.3 and its proof, the
meanings of functions P, F, and S are explained as follows. F(p,A) = 1
means that symbol A can be found in production rule p with a nonempty
succeeding string $\gamma$. P(A) containing (i,X) indicates that X can be
derived from A with the last production rule i. S(A,B) = 1 implies that
$A \Rightarrow^{*} \alpha B \gamma$ with $|\gamma| > 0$. Therefore, if S(X,X) = 1 is found, it implies
that $X \Rightarrow^{*} \alpha X \gamma$ can be found with $|\gamma| > 0$ and $|\alpha| > 0$ because G is in GNF.
Hence, G is self-embedding.

Algorithm 6.3:

Input: CFG G = (N,T,P,S) in GNF.

Output: (1) Yes: G is self-embedding.
        (2) No: G is not self-embedding.

Method:

(1) Number the production rules p = 1 through n.

(2) Let $P(A) = \emptyset$ (empty set) $\forall A \in N$,
    $F(p,A) = 0$ (Logical False) $\forall A \in N$, $1 \le p \le n$,
    $S(X,A) = 0$ (Logical False) $\forall X, A \in N$.

(3) For all rules,
    if $p: A \to a \beta X \gamma$, with $A, X \in N$, $a \in T$, $\beta, \gamma \in N^{*}$,
    then $P(A) = P(A) \cup \{(p,X)\}$ and
    $S(A,X) = F(p,X) = F(p,X) + 1$ if $|\gamma| > 0$,
    $S(A,X) = F(p,X) = F(p,X) + 0$ if $|\gamma| = 0$,
    where + indicates Logical 'OR'.

(4) Repeat this step until there is no change in the P's, F's, and S's:
    if $(i,Y) \in P(A)$,
    then $\forall (j,B) \in P(Y)$:
    $P(A) = P(A) \cup \{(j,B)\}$,
    $S(A,B) = F(j,B) + S(A,Y)$.

(5) If $S(X,X) = 1$ for some $X \in N$,
    then terminate with "Yes",
    otherwise "No".
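
The idea behind the S function can be sketched in a few lines. This is not a transcription of Algorithm 6.3: it keeps only one relation (the analogue of S) and closes it transitively, it represents a GNF production as a hypothetical triple (left side, leading terminal, list of trailing nonterminals), and it assumes every nonterminal derives at least one terminal string, so that a nonempty right context yields a nonempty v.

```python
def is_self_embedding(productions):
    """productions: list of (lhs, terminal, tail) with tail a list of
    nonterminals, i.e. a grammar in Greibach normal form.
    Returns True if some X derives a string ...X... with a nonempty
    right context (the left context is nonempty by GNF)."""
    # rel[(A, B)] is True if B can be reached from A with a nonempty right
    # context somewhere along the derivation, False if it has only been
    # reached with an empty right context.
    rel = {}

    def record(a, b, right):
        old = rel.get((a, b))
        new = right or bool(old)      # logical OR, as in step (3)
        if old is None or new != old:
            rel[(a, b)] = new
            return True
        return False

    # Direct occurrences: B inside A -> a ... B ... (analogue of step (3)).
    for lhs, _terminal, tail in productions:
        for pos, x in enumerate(tail):
            record(lhs, x, pos < len(tail) - 1)

    # Transitive closure until nothing changes (analogue of step (4)).
    changed = True
    while changed:
        changed = False
        for (a, y), r1 in list(rel.items()):
            for (y2, b), r2 in list(rel.items()):
                if y2 == y and record(a, b, r1 or r2):
                    changed = True

    return any(a == b and right for (a, b), right in rel.items())

# S -> a S B | a, B -> b   (generates a^(n+1) b^n, self-embedding)
embedding = [("S", "a", ["S", "B"]), ("S", "a", []), ("B", "b", [])]
# S -> a S | a             (right-linear, not self-embedding)
right_linear = [("S", "a", ["S"]), ("S", "a", [])]
```

A grammar without any recursion, such as a CFSG of a rigid object, never produces a pair (X, X) and is therefore reported as not self-embedding.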

Proof:

To detect the self-embedding property, we have to find out whether
there is $A \Rightarrow^{*} uAv$ with $v \ne \lambda$.

(I) To prove: If G is self-embedding, there must be a production
$k: X \to d \beta B \delta$, and $B \Rightarrow^{*} \eta' A \alpha'$, $d \in T$, $\delta \in N^{+}$,
$\beta, \eta', \alpha, \alpha' \in N^{*}$, such that
$A \Rightarrow^{*} \eta X \alpha \Rightarrow \eta d \beta B \delta \alpha \Rightarrow^{*} \eta d \beta \eta' A \alpha' \delta \alpha$, $\eta \in (T \cup N)^{*}$.
Step (3) makes $(k,B) \in P(X)$, $F(k,B) = 1$, and $S(X,B) = 1$. The repeat of
(4) will cause all the possible X's derived from A to be included in
P(A) with some production i. That is, there must be $(i,X) \in P(A)$ for
some i and also $(j,A) \in P(B)$ for some j. Then, with step (4),
$P(A) = P(A) \cup \{(k,B)\}$, $S(A,B) = S(A,X) + F(k,B) = 1$. And since
$(j,A) \in P(B)$, $P(A) = P(A) \cup \{(j,A)\}$, $S(A,A) = S(A,X) + F(j,A) = 1$.
So, the algorithm will terminate with "Yes".

(II) To prove: If there is $S(X,X) = 1$ for some $X \in N$, then there must
be $X \Rightarrow^{*} uXv$, $v \ne \lambda$. This part is similar to, but the reverse of, (I).

Because the P(X)'s, F's and S's have a finite number of elements, step
(4) will certainly terminate after a finite number of iterations.

Q.E.D.

Algorithm  6.3  is  for  general  CFG's  in  GNF.  Our  CFSG's  are  only  a 
subset  of  the  CFG.  The  CFSG's  for  rigid  objects  are  very  simple  in  syn- 
tax. Because  a shape  within  the  vision  of  an  observer  can  always  be  di- 
vided into  a finite  number  of  simple  curve  segments,  a shape  can  always 
be  syntactically  described  by  an  FSG.  But,  whether  the  nonterminals  of 
an  FSG  can  properly  describe  the  semantically  meaningful  portions  of  a 
shape is another story. Therefore, our CFSG's for rigid objects can
always be converted into FSSG's, but the FSSG may not distinguish well the
classes whose differences are in some semantically meaningful portions
of the shapes. Algorithm 6.4 is developed for the CFG to FSG
conversion.

Algorithm 6.4: CFG to FSG Conversion

Input: A CFG G = (N,T,P,S) which is not self-embedding and in GNF.

Output: FSG F, L(F) = L(G).

Method:

(1) $N_z = \{Z_i\text{'s}\}$ is a set of new symbols not in N.
    $N_z$ is initialized empty.

(2) $\forall\, A \to aB\gamma$, $|\gamma| > 0$:
    If $Z_i \to B\gamma$ for some $Z_i \in N_z$,
    then replace $A \to aB\gamma$ by $A \to aZ_i$.
    Otherwise, for some $Z_j \notin N_z$,
    replace $A \to aB\gamma$ by $A \to aZ_j$ and $Z_j \to B\gamma$,
    and $N_z = N_z \cup \{Z_j\}$.
    $N = N \cup N_z$.

(3) If there are some production rules not in the form of $A \to aX$ or
    $A \to a$, then goto (4).
    Otherwise terminate.

(4) Order < on N: if $A \to B\alpha$, then $A < B$.
    We obtain $A_1 < \cdots < A_n$, $|N| \ge n$.

(5) $i = n-1$.

(6) If $i = 0$ goto (2).
    Otherwise, for $A_i \to A_j \alpha$, $j > i$, with $A_j \to \beta_1 | \cdots | \beta_m$,
    replace it by $A_i \to \beta_1 \alpha | \cdots | \beta_m \alpha$.

(7) $i = i-1$, goto (6).

This algorithm has been implemented in FORTRAN on Purdue's CDC 6500
computer. All the FSSG F's used in Experiments 4.1 and 4.2 were
converted from the CFSG G's by using this algorithm. The success of
Experiments 4.1 and 4.2 has shown the effectiveness of this algorithm for
converting CFSG's of rigid objects.

Since  the  ALA  can  infer  CFSG's  and  Algorithm  6.4  can  convert  the 
CFSG's  to  FSSG's,  the  two  algorithms  can  be  cascaded  to  obtain  FSSG's 
from  the  noisy  training  patterns. 
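
For the simplest case, a CFSG with no recursion at all, the effect of the conversion can be illustrated without the full machinery of Algorithm 6.4. The sketch below is not that algorithm: it assumes a non-recursive grammar given as a dictionary, expands every S-production into its primitive strings, and then emits one chain of right-linear rules per string, using fresh state names Z1, Z2, ... of our own choosing.

```python
def to_right_linear(productions, start):
    """productions: dict nonterminal -> list of right-hand sides (lists of
    symbols). Symbols that are not keys are primitives (terminals).
    Assumes the grammar is non-recursive and has no empty right-hand sides.
    Returns rules (state, primitive, next_state); next_state is None for
    the final rule of a chain."""
    def expand(symbol):
        # All primitive strings derivable from symbol.
        if symbol not in productions:
            return [[symbol]]
        results = []
        for rhs in productions[symbol]:
            partial = [[]]
            for s in rhs:
                partial = [p + e for p in partial for e in expand(s)]
            results.extend(partial)
        return results

    rules = []
    counter = 0
    for sentence in expand(start):
        state = start
        for k, primitive in enumerate(sentence):
            if k == len(sentence) - 1:
                rules.append((state, primitive, None))
            else:
                counter += 1
                nxt = f"Z{counter}"
                rules.append((state, primitive, nxt))
                state = nxt
    return rules

# One S-production of the "L" grammar of Example 6.1.
l_grammar = {
    "SL": [["F1", "A1", "F2", "A2", "N1", "A3"]],
    "N1": [["F3", "A4", "F4"]],
}
fs_rules = to_right_linear(l_grammar, "SL")
```

In the resulting rules each new nonterminal stands only for "the rest of the boundary", which illustrates why the nonterminals of an FSSG cannot be chosen to mark a semantically meaningful portion such as N1.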

The  output  FSSG  from  Algorithm  6.4  has  not  been  minimized  yet. 
Since many authors [85,86] have studied machine minimization, we are not
going  to  discuss  it  in  detail  here.  A minimized  FSSG  will  be  economical 
in  both  storage  and  processing  time. 


CHAPTER 7

RESULTS, CONCLUSIONS, AND SUGGESTIONS


7.1  Summary of Results and Conclusions

A syntactic  method  for  general  shape  recognition  is  proposed  and 
studied.  This  method  utilizes  attributed  grammars  to  describe  and 
recognize  shapes.  The  attributed  grammar  is  a powerful  and  convenient 
tool  when  designing  a highly  intelligent  descriptive  method  because  it 
utilizes  symbolic  productions  and  numeric  attributes  to  describe  syntac- 
tic and  semantic  information.  We  have  successfully  used  the  attributed 
grammars  to  solve  a class  of  general  shape  description  and  recognition 
problems. The primitive extraction has been a difficult problem when
applying the syntactic method to general nonsymbolic patterns. The PEE
idea, which has been successful in shape recognition, provides a
systematic and accurate way to extract primitives. The experimental results in
Chapter 4 have shown the feasibility of this PEE idea and its
advantages in both recognition accuracy and computational efficiency.

Since the proposed syntactic method is not designed for any particular
application, it has the potential of solving general shape recognition
problems without requiring context-sensitive grammars. By employing
error-correcting techniques, the ECPEE parser can recognize badly
distorted shapes which cannot be handled by other existing shape
recognition methods. Taking advantage of the available attributes
which describe the semantic information of the shape, our
error-correcting PEE parser can obtain an error-weight very close to a
human's perception of distortion. Like other error-correcting
algorithms, the ECPEE parser unfortunately has the disadvantage of
costly computation.

The grammatical inference of our particular shape grammars was studied
and reported in Chapter 6. The shape grammar can be obtained from
noisy vector chains automatically using the ALA, an automatic learning
algorithm, or interactively using the ILA, an interactive learning
algorithm. Our experiments have shown that both algorithms can obtain
grammars which can be used to recognize shapes satisfactorily.

7.2  Suggestions  for  Further  Research

Although results from syntactic shape recognition using attributed
grammars appear to be quite satisfactory, there are some aspects about
which we need to know more. Let us summarize the topics for further
investigation in the following paragraphs.

(1)  The recognition functions and the generalized recognition
     functions can be improved. The 0-1 recognition functions,
     described in Assumptions 4.1 and 4.2, are derived from the
     noise study of descriptors. Further investigations into
     noisy primitives, gradual shape changes, and human perception
     of shape similarity and dissimilarity may help us obtain more
     reasonable recognition functions and generalized recognition
     functions.

(2)  The grammatical inference algorithms for shape grammars
     discussed in Chapter 6 can be improved. Also, the combination
     of several subgrammars into a single grammar may be worth an
     extensive study for very complex objects.

(3)  The parsing efficiency of an ECPEE parser or a GECPEE
     parser can be improved. Further research in processing
     vector strings using sequential classification will also
     be fruitful.

In addition, there are two other interesting related topics: vector
pattern generation and moving object identification. We will discuss
them in the following two subsections.

7.2.1  Pattern  Generation  and  Error  Correction 

The shape grammars studied so far are for recognition purposes.
However, generating patterns is one of the capabilities of a pattern
grammar [38,93]. Our shape grammars can certainly generate patterns
which are represented in terms of primitives and their associated
attributes. In other words, the patterns generated are strings of
primitive symbols and descriptors. In this section, we will discuss
the feasibility of generating vector patterns or boundary vector
chains.

To  generate  vector  chains  from  the  descriptors,  the  descriptors 
describing  the  primitives  must  be  unique.  An  angle  primitive,  the  con- 
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be  unique.  For  example,  |t|/L  = 1,  i.e.,  all  the  curve  primitives  have 
to  be  straight  lines.  Then,  it  becomes  possible  to  map  the  descriptors 
to  the  vectors.  That  is,  if  we  restrict  the  semantic  information  to  a 
certain  type  of  curve  segment  which  can  be  uniquely  described  by  the
attributes,  then  the  vector  chains  can  be  generated  by  the  shape  grammars.
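
The straight-line case can be illustrated with a short Python sketch. It assumes that every curve primitive is a straight segment described only by its length, and that each angle primitive is read as a signed turning angle in degrees between consecutive segments. That reading of the angle, the function name, and the parameters are illustrative assumptions, not the descriptor definitions used elsewhere in this work.

```python
import math


def chain_from_straight_primitives(lengths, turn_angles_deg,
                                   start=(0.0, 0.0), heading_deg=0.0):
    """Map straight-line primitives and connecting angles to boundary points.

    lengths         : length of each straight segment, in order
    turn_angles_deg : turning angle applied after each segment except the last
    """
    if len(turn_angles_deg) != len(lengths) - 1:
        raise ValueError("need exactly one angle between consecutive segments")
    x, y = start
    heading = math.radians(heading_deg)
    points = [(x, y)]
    for k, length in enumerate(lengths):
        # Advance along the current heading by the segment length.
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        points.append((x, y))
        # Turn by the connecting angle before the next segment.
        if k < len(turn_angles_deg):
            heading += math.radians(turn_angles_deg[k])
    return points


# A unit square traced counterclockwise: four sides, three left turns.
square = chain_from_straight_primitives([1, 1, 1, 1], [90, 90, 90])
```

The mapping is unique only because a straight segment is fully determined by its length and direction. For general curve primitives, several different vector chains could share the same descriptors.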

Unfortunately, a shape grammar which has restricted curve primitives
and which can generate the vector chain is not generally effective for
recognition. Suppose that we use only straight lines as curve
primitives. If the true boundary is a continuous and smooth curve, it
can be approximated by a series of straight lines. With different
sizes and rotations, the boundary can be approximated by different
numbers of straight lines of different lengths and connecting angles.
It may then be difficult to recognize the primitives, namely the
straight lines and the angles. Therefore, a shape grammar which is
effective in recognizing vector patterns and in generating symbolic
patterns with attributes may not be as effective in generating vector
patterns. Conversely, a shape grammar which is effective in
generating vector patterns may not be as effective in recognition.

In case we need a grammar for both generation and recognition, we
would suggest having two levels of primitive sets. The higher level
primitive set is for recognition, and the lower level primitive set is
for generation. To avoid confusion, such a grammar can be considered
as a dual grammar $D_\ell$, a combination of a recognition shape
grammar $G_\ell = (V_\ell, T_\ell, P_\ell, S_\ell)$ and a generation
shape grammar $H_\ell = (U_\ell, E_\ell, Q_\ell, \ldots)$.

Since $H_\ell$ is for generation, and the primitives in $E_\ell$ are
uniquely defined by associated descriptors, the attribute rules
associated with production rules in $Q_\ell$ are useless and hence
omitted. $G_\ell$ is the same as the general form in Section 3.4,
where $\ell$ is the label of the shape. $Q_\ell$ has F-productions in
addition to the S-production and N-productions. $E_\ell$ contains
curve primitives e's and angle primitives a's, which describe the
connections between consecutive e's, in addition to A's. The
F-productions may not be obtained from the descriptors, because the
mapping from the descriptor to the vector pattern is generally not
unique. Thus, $H_\ell$ cannot generally be deduced from $G_\ell$.


(G  ,H  ) is  a dual  shape  grammar  for  both  recognition  and  gen^ 


S„  - (XA)  XAf Answer  > 


D(N)  * (D(X)®)  D(X) 


Example 7.1


Suppose we use straight lines as e's. A continuous and smooth
boundary of a "K" shape (see Figure 7.1) can then be recognized and
generated by the following dual grammar $D_K$.
$F_6 \to e_{10}\ a_6\ e_{11}\ a_7\ e_{12}\ a_8\ e_{13}$

As we can see, the S- and N-productions in $Q_K$ correspond to those
in $P_K$, and the F-productions in $Q_K$ produce e's and a's for all F
in $G_K$. We can use $G_K$ for recognition, $H_K$ for generation, and
both $G_K$ and $H_K$ for error correction. As mentioned before, the
error-correcting techniques can be applied in the parsing to recognize
partially distorted shape patterns. Once the distorted pattern is
recognized using $P_K$, the corresponding productions in $Q_K$ can be
applied to reconstruct the distorted portion of the shape.
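
The reconstruction step can be pictured as expanding each recognized higher-level primitive F into its lower-level e's and a's. The Python sketch below assumes the F-productions are stored in a lookup table. Only the F6 rule is taken from Example 7.1; the table name, the function, and the input sequence are hypothetical.

```python
# F-productions of a generation grammar. Only the F6 entry comes from
# Example 7.1; a full table would hold one entry per F primitive.
F_PRODUCTIONS = {
    "F6": ["e10", "a6", "e11", "a7", "e12", "a8", "e13"],
}


def expand(symbols, f_productions):
    """Replace each F symbol by its lower-level primitives.

    Symbols without an F-production (for example angle primitives that
    already belong to the lower level) are copied through unchanged.
    """
    out = []
    for symbol in symbols:
        out.extend(f_productions.get(symbol, [symbol]))
    return out


# Hypothetical recognized fragment: an angle followed by the primitive F6.
lower_level = expand(["a5", "F6"], F_PRODUCTIONS)
```

In an error-correcting setting, the distorted portion of the input would be replaced by the expansion of the primitive that the parser decided on.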

7.2.2  Moving  Object  Identification 

Moving object identification is another interesting research topic
[94,95,96], especially when different objects have similar shapes at
certain viewing angles. For instance, some airplane models have very
similar front views but different top views. If an unknown airplane
is flying in the sky, we would like to classify it with a minimum
amount of time and effort. If the view observed is the front view,
then one more view may be necessary to resolve any ambiguity.
Therefore, two interesting questions arise: (1) Is one more view
needed? (2) What is the next best view?

Let us suppose that each object $C_i$ has $n_i$ views. By a view we
mean the set of probable shapes observed at a certain viewing angle.
Due to noise, resolution, or other reasons such as removable parts,
there may be several appearances, $P_{ij}$'s, for a viewing angle
$V_j$ of the object $C_i$. Each view can be described by a subgrammar
$G(C_i,V_j)$, $j = 1,\ldots,n_i$. It was mentioned in Section 4.6
that subgrammars for one class can be combined to obtain a single
grammar $G(C_i)$, or can be treated independently. In this section,
all the subgrammars will be considered independently, though they can
be combined into one single grammar $G(C_i)$ with different labels
$V_j$. With dual shape grammars, the $P_{ij}$'s can be generated by
$G(C_i,V_j)$. Let us first find the relationships between views by
classifying the $P_{ij}$'s with all the subgrammars.

Definition 7.2

$M(C_\ell,V_k) = \{\, (C_i,V_j) \mid \text{any of the } P_{ij}\text{'s is accepted by } G(C_\ell,V_k) \,\}$.

The relational sets M have the following interesting properties:

(1)  $(C_i,V_j) \in M(C_i,V_j)$.

(2)  If $(C_i,V_j) \in M(C_\ell,V_k)$ and $(C_\ell,V_k) \notin M(C_i,V_j)$,
     then $G(C_i,V_j)$ alone is sufficient to distinguish $(C_i,V_j)$
     from $(C_\ell,V_k)$. This property has been used in constructing
     the classification tree. (See Section 3.7 and Experiment 4.2.)

(3)  If $(C_i,V_j) \notin M(C_\ell,V_k)$ and $(C_\ell,V_k) \notin M(C_i,V_j)$,
     then the two views are clearly different.

(4)  If $(C_i,V_j) \in M(C_\ell,V_k)$ and $(C_\ell,V_k) \in M(C_i,V_j)$,
     then the two views are similar.
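
The relational sets and properties (2) to (4) can be sketched in Python as follows. The sketch assumes that M of a view collects every view whose sample patterns are accepted by that view's subgrammar. The `accepts` callback stands in for a real parser, and the acceptance table is made-up data, not a result from the experiments.

```python
def relational_sets(views, accepts):
    """Build M(g) for every view g.

    views   : list of hashable view labels such as ("A", "front")
    accepts : accepts(g, v) is True if any sample pattern of view v is
              accepted by the subgrammar of view g (hypothetical callback)
    """
    return {g: {v for v in views if accepts(g, v)} for g in views}


def compare(M, a, b):
    """Classify the pair of views (a, b) by properties (2) to (4)."""
    a_in_b = a in M[b]  # patterns of a are accepted by the grammar of b
    b_in_a = b in M[a]  # patterns of b are accepted by the grammar of a
    if a_in_b and b_in_a:
        return "similar"
    if not a_in_b and not b_in_a:
        return "clearly different"
    # Exactly one grammar rejects the other view; that grammar alone
    # separates the pair, so return the view it belongs to.
    return a if a_in_b else b


# Toy acceptance table: grammar of a view -> views it accepts.
TABLE = {
    ("A", "front"): {("A", "front"), ("B", "front")},
    ("B", "front"): {("B", "front")},
    ("A", "top"): {("A", "top")},
}
views = list(TABLE)
M = relational_sets(views, lambda g, v: v in TABLE[g])
```

Here the grammar of ("B", "front") rejects the front view of A, so `compare` returns ("B", "front") for that pair, matching property (2).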

Let us define the intersection I as follows.

Definition 7.3

$I[(C_i,V_j),(C_\ell,V_k)] = M(C_i,V_j) \cap M(C_\ell,V_k)$

$I[(C_i,V_j),(C_\ell,V_k)] \supseteq \{(C_i,V_j),(C_\ell,V_k)\}$
implies that both views $(C_i,V_j)$ and $(C_\ell,V_k)$ can be
recognized by $G(C_i,V_j)$ or by $G(C_\ell,V_k)$. If $i \neq \ell$
and the viewing angle of the object is unknown, or known as $V_k$,
then another view may be necessary to identify the object. If the
viewing angle of the object can be obtained, the next best view would
be $V_b$, at which $I[(C_i,V_b),(C_\ell,V_b)]$ has the minimum number
of elements in the set.

If we are classifying an unknown moving object against two classes
$C_i$ and $C_j$, we may assume that the viewing angle as a function of
time, $V(t)$, can be obtained by estimating the velocity and
trajectory by other means such as radar. The best picture shooting
time for classification will be $t_0$, at which

$N = |I[(C_i,V(t_0)),(C_j,V(t_0))]| = \min_t |I[(C_i,V(t)),(C_j,V(t))]|$

If N = 0, the object can be recognized without the next shot. If
N > 0, one more shot may be necessary.
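
The choice of shooting time can be sketched in Python as a search over sampled times. The sketch assumes the relational sets are available as a dictionary from view labels to sets of view labels, and that the viewing angle function is supplied as a callable. The names, the sampling grid, and the data are hypothetical.

```python
def intersection(M, a, b):
    """I[a, b]: the views common to the relational sets of a and b."""
    return M[a] & M[b]


def best_shot_time(M, class_i, class_j, view_at, times):
    """Return (t0, N), where t0 minimizes N = |I[(Ci,V(t)), (Cj,V(t))]|.

    view_at : view_at(t) gives the viewing angle label at time t
    times   : candidate shooting times (a finite sample of the trajectory)
    """
    def ambiguity(t):
        v = view_at(t)
        return len(intersection(M, (class_i, v), (class_j, v)))

    t0 = min(times, key=ambiguity)  # first time with the smallest count
    return t0, ambiguity(t0)


# Made-up relational sets: A and B look alike from the front only.
M = {
    ("A", "front"): {("A", "front"), ("B", "front")},
    ("B", "front"): {("A", "front"), ("B", "front")},
    ("A", "top"): {("A", "top")},
    ("B", "top"): {("B", "top")},
}
# Hypothetical trajectory: front view before t = 5, top view afterwards.
t0, n = best_shot_time(M, "A", "B",
                       lambda t: "front" if t < 5 else "top", range(10))
```

With a fixed time replaced by a list of candidate angles, the same minimization selects the next best view of a stationary object.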

The union of the intersections over all angle views can be defined as
follows.

Definition 7.4

$U(C_i,C_\ell) = \bigcup_{j,k} I[(C_i,V_j),(C_\ell,V_k)]$

If $U(C_i,C_\ell) = \emptyset$, then the two objects differ clearly at
any viewing angle. Only one view is sufficient to distinguish them.
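
A short Python sketch of this test follows. It assumes the union runs over all pairs of views of the two classes and that the relational sets are stored as a dictionary keyed by (class, view) labels. Both the data layout and the example values are illustrative.

```python
from itertools import product


def union_of_intersections(M, class_a, class_b):
    """U(Ca, Cb): union of M(a) & M(b) over all view pairs of the classes."""
    views_a = [v for v in M if v[0] == class_a]
    views_b = [v for v in M if v[0] == class_b]
    union = set()
    for a, b in product(views_a, views_b):
        union |= M[a] & M[b]
    return union


def one_view_suffices(M, class_a, class_b):
    """True if the two classes can never be confused at any viewing angle."""
    return not union_of_intersections(M, class_a, class_b)


# Made-up relational sets: A and B share a front view, C resembles neither.
M = {
    ("A", "front"): {("A", "front"), ("B", "front")},
    ("B", "front"): {("A", "front"), ("B", "front")},
    ("A", "top"): {("A", "top")},
    ("B", "top"): {("B", "top")},
    ("C", "top"): {("C", "top")},
}
```

Because the sets depend only on the grammars and training views, this check can be computed once, before any unknown object is observed.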

Definitions  7.3  and  7.4  for  two-class  problems  can  be  generalized
to  multi-class  problems.


contains  the  minimum  number  of  C's.  Let


Then,  the  best  second  view  will  be 


us  suppose  they  are  C 


V,)]  contains  the  minimum  number


of  C 's.  The  view  can  be  selected  beforehand  from  the  relational  sets 


For  moving  objects,  the  problem  is  a little  more  complicated.  We  have 


tion.  If  the  I function  of  V.  contains  more  than  one  class,  then  we 


have  to  make  sure  that  the  subsequent  views  selected  can  be  obtained 


For  this  reason,  V„  is  the  first  view  of  the  best  view  series 


have  the  minimum  number  of  views  or  the  views  within  the  most  minimal


time  with  respect  to  a given  velocity  and  a given  trajectory 
APPENDIX  A


MODEL  PREPARATION,  PICTURE  TAKING,  AND  DIGITIZATION 

To  demonstrate  that  our  method  can  distinguish  shapes  by  structural 
differences  as  well  as  by  tiny  boundary  differences,  we  selected  four 
airplane  models.  They  are  F102,  B52,  F86,  and  MIG-15.  F102  and  B52 
have  completely  different  shape  structures  from  any  angle  view.  F86  and 
MIG-15  are  very  similar  in  shape  structure  in  most  of  the  angle  views, 
but  slightly  different  in  boundary  details. 

For demonstrative purposes, we have made the following assumptions:
(1) the picture can be taken at any arbitrary viewing angle, (2) the
picture taking is fast enough that the shape in the image looks
stationary, and (3) the pictures have relatively low background noise.
The first assumption indicates that no particular angle view can be
hidden. To satisfy this assumption, we cannot use any visible support
for the models. Dudani [6], in his experiment on moment invariants,
used an apparatus to hold the airplane in front of the camera. The
apparatus controlled the viewing angle, but it blocked a portion of
the aircraft from the camera view in a certain range of viewing
angles. He used such an apparatus because he had to take pictures at
every 5° to collect enough training samples. In our experiment, we do
not need to control the exact angle of the view. Therefore, we tied
three very thin white threads to the airplane at three widely spaced
and balanced positions. These three threads were then hung from the
ceiling. Any arbitrary angle could be obtained by changing the
lengths of the three threads and the position of the camera.

The second assumption is reasonable but difficult to satisfy directly
with our TV camera. After we hung the airplane at a desired angle, it
took at least 15 minutes for the pendulum motion to decrease to where
it was almost unnoticeable, but the model still kept moving back and
forth slowly. The digitization of a 200x200 picture through the TV
camera took about 10 seconds. If we took the digital picture directly
from the TV camera, the small pendulum motion would distort the
picture so much that even a human would not know what was in the
picture. This assumption is not difficult to satisfy, however, if we
first take the analog picture through an ordinary camera using a 1/10
second or faster exposure time, and then digitize the analog picture
via the TV camera.

The third assumption simply reduces the background noise to an extent
that it is not too difficult to find the boundary. We satisfied this
assumption by increasing the contrast between object and background
and adjusting the illumination. We used a paper board covered with a
large sheet of white paper as the background, and we painted the
airplane models with a flat black color spray. The relative distance
between the airplane and the background is adjustable.

The  first  part  of  our  experiment  can  be  summarized  in  the  following 
steps: 

1.  Collect  the  models. 

2.  Prepare  the  white  background.  Paint  the  models  black.  Tie  the 
models  properly  with  white  threads. 
3.  Hang  up  the  model  at  a desired  position  and  angle.

4.  Adjust  the  relative  position  of  the  background  and  the  illumination 
to  obtain  a good  contrast  without  strong  reflections  and  dark 
shades. 

5.  Adjust  the  camera  to  the  proper  position  and  take  pictures. 

6.  Attach  the  analog  picture  to  a wall  in  front  of  the  TV  camera.

7.  Adjust  illumination  and  digitize. 

The four collected airplane models were painted with a flat black
color to avoid reflection. They are shown in Figure 4.1. The B52,
F86, and MIG-15 were tied with white threads at the two wings and the
tail, while the F102 was tied at the two wings and the nose. These
threads may be visible in the analog picture, but will not be seen in
the digital picture because of digitization resolution. Figure A.1
shows the set-up for taking analog pictures. The set-up for
digitization is simpler and is shown in Figure A.2. The whole
experiment was designed to be movable and was set up in the Laboratory
of Pattern Processing and Advanced Automation, directed by Professor
K. S. Fu. The digitization process was controlled interactively
through a PDP 11/45 computer in the laboratory. In order to get an
adequate resolution with minimal picture size to save storage, we did
not restrict ourselves to a fixed picture size. Before digitization,
we adjusted the relative distance and focused the TV camera to obtain
a reasonably clear picture on a TV monitor. Then, we gave commands
through the computer to start digitizing the picture and to save the
digital picture on a disc or a magnetic tape. Most


Figure  A.1  Laboratory  Set-up  for  Picture  Taking 


APPENDIX  B 


This  appendix  contains  20  shapes  obtained  through  procedures
described  in  Sections  4.2  and  4.3.  The  boundary  vector  chains  start  from
the  bottom  of  the  shape  and  outline  the  objects  counterclockwise.  These 
20  shapes  are  divided  into  two  groups,  L and  T.  The  L group  contains  6 
shapes  of  view  B52,A  and  4 shapes  of  view  F102,A.  They  are  shown  in 
(a)-(j).  The  T group  contains  5 B52,A's  and  5 F102,A's.  They  are  shown 
in  (k)-(t).  All  twenty  shapes  were  used  for  testing  in  Experiment  4.1. 
The  10  shapes  in  L group  were  used  as  training  patterns  and  the  10 
shapes  in  T group  were  used  as  testing  patterns  in  Experiment  6.1. 
APPENDIX  C

Sixteen shapes obtained through procedures described in Sections 4.2
and 4.3 are contained in this appendix. They are 5 F86,T's, 5
MIG-15,V's, 3 MIG-15,U's, and 3 MIG-15,T's, shown on the following
pages in sequence. They were used in Experiment 4.2 and Section 4.6.
The boundary vector chains start from the bottom of the shape and
outline the objects counterclockwise.
The finite-state shape grammar $F_{F86,T}$, converted from the
context-free shape grammar $G_{F86,T}$, is listed in this appendix.
It was used in Experiment 4.2.
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